Toy Neural Network and Recurrent Neural Network Regression with Tensorflow (2)

Code: here    Data: here

Now let's build a more sophisticated RNN so that we can look back dynamically, without hand-picking the look-back value k as we did in the previous post.

In its simplest form, an RNN is defined as:

import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell  # legacy TF 0.x RNN modules

def RNN(_X, _istate, _weights, _biases):

    # input shape: (batch_size, n_steps, n_input)
    _X = tf.transpose(_X, [1, 0, 2])  # permute n_steps and batch_size
    # Reshape to prepare input to hidden activation
    _X = tf.reshape(_X, [-1, n_input]) # (n_steps*batch_size, n_input)
    # Linear activation
    _X = tf.matmul(_X, _weights['hidden']) + _biases['hidden']

    # Define a basic RNN cell with tensorflow
    basic_rnn_cell = rnn_cell.BasicRNNCell(n_hidden)
    # Split data because rnn cell needs a list of inputs for the RNN inner loop
    _X = tf.split(0, n_steps, _X) # n_steps * (batch_size, n_hidden)

    # Get RNN cell outputs (one per time step)
    outputs, states = rnn.rnn(basic_rnn_cell, _X, initial_state=_istate)

    # Linear activation
    # Get inner loop last output
    return tf.matmul(outputs[-1], _weights['out']) + _biases['out']

One thing to notice is that TensorFlow already has an RNN cell defined, so we just need to call the API. (Remember the old days when we had to write our own implementation? This is so much better!)
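
The function above assumes that the input placeholder, the initial state, and the weight/bias dictionaries are defined elsewhere. A minimal sketch of that setup, following the usual TensorFlow 0.x example layout (the names x, istate, y, weights, and biases are illustrative choices here, not code from the original script):

# tf Graph input: a window of n_steps past vectors per example
x = tf.placeholder("float", [None, n_steps, n_input])
# initial hidden state of the BasicRNNCell (n_hidden units per example)
istate = tf.placeholder("float", [None, n_hidden])
# regression target
y = tf.placeholder("float", [None, n_out])

# hidden-layer and output projection weights
weights = {
    'hidden': tf.Variable(tf.random_normal([n_input, n_hidden])),
    'out': tf.Variable(tf.random_normal([n_hidden, n_out]))
}
biases = {
    'hidden': tf.Variable(tf.random_normal([n_hidden])),
    'out': tf.Variable(tf.random_normal([n_out]))
}

# prediction op built from the RNN function defined above
pred = RNN(x, istate, weights, biases)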

Of course, we need to modify the data preparation code a little. With an RNN, the maximum number of steps we look back is a variable of our own choice; I set it to 100. So each example we feed the network pairs an input window of shape maximum_lookback_steps × input_size with the corresponding output vector.

import numpy as np

# Network Parameters
n_input = 10   # dimensionality of the data vector at each time step
n_steps = 100  # timesteps (maximum look-back T)
n_hidden = 128 # hidden layer num of features
n_out = 10     # dimensionality of the regression target

def prepare_data(data, T):  # look back T steps max
    data_num = data.shape[1]
    data_size = data.shape[0]
    
    std = np.std(data, axis = 1)
    mean = np.mean(data, axis = 1)
    # first we need to normalize the data (zero mean, unit variance per dimension)
    for i in range(data_num):
        for j in range(data_size):
            data[j,i] = (data[j,i]-mean[j])/std[j]
    
    data_num = data_num - T  # we need to start at X[T] to look back T steps 
    input_size = (T, data.shape[0])
    output_size = data.shape[0]
    all_input = np.zeros((T, data.shape[0], data_num))
    all_output = np.zeros((output_size, data_num))
    
    for i in range(data_num):
        all_output[:,i] = data[:,i+T]
        for j in range(T):
            all_input[j, :, i] = data[:, i+T-j-1]
    
    # randomly hold out one fifth of the data for testing (an 80/20 split)
    order = np.random.permutation(data_num)
    training_num = int(data_num*4/5)
    testing_num = data_num - training_num
    training_order = order[0:training_num]
    testing_order = order[training_num:data_num]
    
    training_input = all_input[:, :, training_order]
    training_output = all_output[:, training_order]
    
    testing_input = all_input[:, :, testing_order]
    testing_output = all_output[:, testing_order]
    
    # reorder to (num_examples, T, data_size) for inputs and (num_examples, data_size) for outputs
    return (training_input.transpose((2, 0, 1)), training_output.transpose(),
            testing_input.transpose((2, 0, 1)), testing_output.transpose())
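
As a quick sanity check on the shapes that prepare_data returns (the random data here is purely illustrative):

data = np.random.randn(10, 5000)  # illustrative: a 10-dimensional series with 5000 time points
tr_x, tr_y, te_x, te_y = prepare_data(data, n_steps)

print(tr_x.shape)  # (3920, 100, 10) -> (num_examples, n_steps, n_input)
print(tr_y.shape)  # (3920, 10)      -> (num_examples, n_out)
print(te_x.shape)  # (980, 100, 10)
print(te_y.shape)  # (980, 10)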

Now you can start training. The training loop is pretty much the same as in the previous post, so I will not repeat it in full here.
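
For orientation, a minimal sketch of what that loop might look like, assuming the x, istate, and y placeholders sketched above, mean squared error as the loss, and an Adam optimizer (the hyperparameters and batching details are illustrative assumptions, not the original code):

learning_rate = 0.001
batch_size = 128
training_iters = 20000

cost = tf.reduce_mean(tf.square(pred - y))  # mean squared error
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

init = tf.initialize_all_variables()  # TF 0.x-era initializer

with tf.Session() as sess:
    sess.run(init)
    for step in range(1, training_iters + 1):
        # sample a random minibatch from the training set
        idx = np.random.randint(0, tr_x.shape[0], batch_size)
        feed = {x: tr_x[idx], y: tr_y[idx],
                istate: np.zeros((batch_size, n_hidden))}  # zero initial state
        sess.run(optimizer, feed_dict=feed)
        if step % 1000 == 0:
            print("Iter %d, Minibatch Loss= %.6f" % (step, sess.run(cost, feed_dict=feed)))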

The end result looks like this:

Iter 14000, Minibatch Loss= 0.162863
Iter 15000, Minibatch Loss= 0.164219
Iter 16000, Minibatch Loss= 0.181672
Iter 17000, Minibatch Loss= 0.129802
Iter 18000, Minibatch Loss= 0.144143
Iter 19000, Minibatch Loss= 0.157080

Note that we are using the same loss function as in the NN implementation, so this is actually much better than our previous result!

We also did a Bayesian Neural Network implementation of this; it will be posted soon.
