Now let's build a more sophisticated RNN that can look back dynamically, without hand-picking the look-back value as we did in the previous post.
In its simplest form, an RNN is defined as:
```python
def RNN(_X, _istate, _weights, _biases):
    # input shape: (batch_size, n_steps, n_input)
    _X = tf.transpose(_X, [1, 0, 2])  # permute n_steps and batch_size
    # Reshape to prepare input to hidden activation
    _X = tf.reshape(_X, [-1, n_input])  # (n_steps*batch_size, n_input)
    # Linear activation
    _X = tf.matmul(_X, _weights['hidden']) + _biases['hidden']
    # Define a basic RNN cell with tensorflow
    basic_rnn_cell = rnn_cell.BasicRNNCell(n_hidden)
    # Split data because the rnn cell needs a list of inputs for the RNN inner loop
    _X = tf.split(0, n_steps, _X)  # n_steps * (batch_size, n_hidden)
    # Get RNN cell output
    outputs, states = rnn.rnn(basic_rnn_cell, _X, initial_state=_istate)
    # Linear activation on the last output of the inner loop
    return tf.matmul(outputs[-1], _weights['out']) + _biases['out']
```
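To make the recurrence concrete, here is a rough NumPy sketch of what this network computes: a linear input projection, a BasicRNNCell-style tanh update repeated over the time steps, and a linear readout of the last output. The weight names, small shapes, and random initialization are illustrative placeholders, not the post's actual variables.

```python
import numpy as np

# Illustrative shapes (much smaller than the post's n_input=10, n_steps=100, n_hidden=128)
n_input, n_steps, n_hidden, n_out, batch_size = 4, 5, 8, 4, 3

rng = np.random.default_rng(0)
X = rng.standard_normal((batch_size, n_steps, n_input))

# Hypothetical weights standing in for _weights / _biases above
W_hid = rng.standard_normal((n_input, n_hidden)) * 0.1   # input-to-hidden projection
b_hid = np.zeros(n_hidden)
W_rec = rng.standard_normal((n_hidden, n_hidden)) * 0.1  # recurrent weights
b_rec = np.zeros(n_hidden)
W_out = rng.standard_normal((n_hidden, n_out)) * 0.1     # hidden-to-output readout
b_out = np.zeros(n_out)

h = np.zeros((batch_size, n_hidden))  # initial state, playing the role of _istate
for t in range(n_steps):
    x_t = X[:, t, :] @ W_hid + b_hid       # linear input-to-hidden activation
    h = np.tanh(x_t + h @ W_rec + b_rec)   # tanh recurrence, as in BasicRNNCell
y = h @ W_out + b_out                      # linear activation on the last output

print(y.shape)  # (3, 4)
```

Only the final hidden state feeds the readout, mirroring `outputs[-1]` in the TensorFlow code.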
One thing to notice is that TensorFlow already has an RNN cell defined, so we just need to call the API. (Remember the old days when we had to write our own implementation? This is so much better!)
Of course we need to modify the data preparation code a little. For the RNN, the maximum number of steps we look back is a choice of our own; I set it to 100. So each input/output pair we feed the network has an input of size input_size × maximum_lookback_steps.
```python
import numpy as np

# Network Parameters
n_input = 10
n_steps = 100   # timesteps
n_hidden = 128  # hidden layer num of features
n_out = 10

def prepare_data(data, T):  # look back T steps max
    data_num = data.shape[1]   # number of time points
    data_size = data.shape[0]  # number of features
    std = np.std(data, axis=1)
    mean = np.mean(data, axis=1)
    # first we need to normalize the data (demean and divide by std)
    for i in range(data_num):
        for j in range(data_size):
            data[j, i] = (data[j, i] - mean[j]) / std[j]
    data_num = data_num - T  # we need to start at X[T] to look back T steps
    output_size = data_size
    all_input = np.zeros((T, data_size, data_num))
    all_output = np.zeros((output_size, data_num))
    for i in range(data_num):
        all_output[:, i] = data[:, i + T]
        for j in range(T):
            all_input[j, :, i] = data[:, i + T - j - 1]
    # five-fold cross-validation split
    order = np.random.permutation(data_num)
    training_num = int(data_num * 4 / 5)
    training_order = order[0:training_num]
    testing_order = order[training_num:data_num]
    training_input = all_input[:, :, training_order]
    training_output = all_output[:, training_order]
    testing_input = all_input[:, :, testing_order]
    testing_output = all_output[:, testing_order]
    return (training_input.transpose((2, 0, 1)), training_output.transpose(),
            testing_input.transpose((2, 0, 1)), testing_output.transpose())
```
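A tiny self-contained example of the windowing logic may help: for each target column, the T preceding columns are stacked newest-first, and the result is transposed to the (batch, n_steps, n_input) layout the RNN expects. The toy sizes here are assumptions for illustration (the post uses n_input = 10 and T = 100), and normalization is skipped to keep the values easy to trace.

```python
import numpy as np

# Tiny synthetic data: 3 features over 12 time points, look back T = 4 steps
data_size, total_num, T = 3, 12, 4
data = np.arange(data_size * total_num, dtype=float).reshape(data_size, total_num)

data_num = total_num - T  # usable samples after reserving T steps of history
all_input = np.zeros((T, data_size, data_num))
all_output = np.zeros((data_size, data_num))
for i in range(data_num):
    all_output[:, i] = data[:, i + T]                # target: the step after the window
    for j in range(T):
        all_input[j, :, i] = data[:, i + T - j - 1]  # inputs: T steps, newest first

# Transposed as in prepare_data: (batch, n_steps, n_input) for the RNN
batch_input = all_input.transpose((2, 0, 1))
print(batch_input.shape)   # (8, 4, 3)
print(all_input[0, :, 0])  # newest input of sample 0, i.e. data[:, T-1]
```

So 12 time points yield 8 training samples, each pairing a 4-step history with the next time step as the target.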
Now you can start training. It is pretty much the same as in the previous post, so I will not post it here.
The end result looks like this:
```
Iter 14000, Minibatch Loss= 0.162863
Iter 15000, Minibatch Loss= 0.164219
Iter 16000, Minibatch Loss= 0.181672
Iter 17000, Minibatch Loss= 0.129802
Iter 18000, Minibatch Loss= 0.144143
Iter 19000, Minibatch Loss= 0.157080
```
Note that we are using the same loss function as the NN implementation, so this is actually much better than our previous result!
We also did a Bayesian Neural Network implementation of this; it will be posted soon.