Toy Neural Network and Recurrent Neural Network Regression with Tensorflow (1)

So this week I'm visiting my wife at Janelia, which gave me some time to play with Tensorflow and see how things work.

I only have a MacBook Pro with me, so I can only run TensorFlow in CPU mode. Anyway, she gave me a dataset, which she says is some kind of neural spiking data, and asked me to build two neural networks (a simple NN and a simple RNN).

The input data has 10 dimensions. The model needs to look back several time steps, X_{t-1}, X_{t-2}, ..., X_{t-k}, to predict the next output X_t.

Case 1: Simple NN.

Code: here     Data:here

In this case we need to specify k, say k=3. We then concatenate the 3 previous inputs into a single input vector of dimension k*Input_length and pass it to a hidden layer.

Thus the prepare data function looks like this:

import numpy as np

def prepare_data(data, T):
    # returning X[n-1...n-T] as input, X[n] as output
    data_num = data.shape[1]
    data_size = data.shape[0]
    std = np.std(data, axis = 1)
    mean = np.mean(data, axis = 1)
    # first normalize each feature (row) to zero mean and unit variance (z-score)
    for i in range(data_num):
        for j in range(data_size):
            data[j,i] = (data[j,i]-mean[j])/std[j]
    
    data_num = data_num - T  # we need to start at X[T] to look back T steps 
    input_size = T*data.shape[0]
    output_size = data.shape[0]
    all_input = np.zeros((input_size, data_num))
    all_output = np.zeros((output_size, data_num))
    
    for i in range(data_num):
        all_output[:,i] = data[:,i+T]
        for j in range(T):
            all_input[j*data.shape[0] : (j+1)*data.shape[0], i] = data[:, i+T-j-1]
    
    # random 80/20 train/test split (a single hold-out split, not full five-fold cross-validation)
    order = np.random.permutation(data_num)
    training_num = int(data_num*4/5)
    testing_num = data_num - training_num
    training_order = order[0:training_num]
    testing_order = order[training_num:data_num]
    
    training_input = all_input[:, training_order]
    training_output = all_output[:, training_order]
    
    testing_input = all_input[:, testing_order]
    testing_output = all_output[:, testing_order]
     
    return training_input.transpose(), training_output.transpose(), testing_input.transpose(), testing_output.transpose()
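To check the window layout that prepare_data produces, the sketch below rebuilds the same lagged inputs on a tiny array. The helper name make_lagged is hypothetical (it is not part of the original code), and it leaves out the normalization and the random split so the result is easy to read.

```python
import numpy as np

def make_lagged(data, T):
    """Build (inputs, targets) from data of shape (n_features, n_steps).

    Hypothetical helper for illustration; it mirrors the indexing in
    prepare_data but skips normalization and the train/test split.
    """
    n_features, n_steps = data.shape
    n_samples = n_steps - T  # the first target is X[T]
    # Block j holds the observation (j + 1) steps before the target,
    # so the most recent step comes first in each input vector.
    blocks = [data[:, T - j - 1 : T - j - 1 + n_samples] for j in range(T)]
    inputs = np.vstack(blocks)   # shape (T * n_features, n_samples)
    targets = data[:, T:]        # shape (n_features, n_samples)
    return inputs.T, targets.T

toy = np.arange(12).reshape(2, 6)  # 2 features, 6 time steps
toy_inputs, toy_targets = make_lagged(toy, T=3)
```

Here toy_inputs[0] is [2, 8, 1, 7, 0, 6] (time steps 2, 1, 0) and toy_targets[0] is [3, 9] (time step 3), so each row pairs the three previous observations with the one that follows them.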

The model is defined in the following snippet. Note that sigmoid is not the best choice of activation, but I was just playing with TensorFlow, so getting it working was my main goal.

[snipped id="20"]
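The snippet itself is not reproduced here. As a rough idea of what a one-hidden-layer sigmoid regressor computes, here is a NumPy sketch of the forward pass. The names init_weights and model and their arguments match the calls in the training code later in this post, but the bodies below (including the 0.01 initialization scale and the absence of bias terms) are assumptions, not the original TensorFlow code.

```python
import numpy as np

def init_weights(shape, scale=0.01, seed=0):
    # Assumed initialization: small Gaussian noise (the scale is a guess).
    rng = np.random.RandomState(seed)
    return rng.normal(0.0, scale, size=shape)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(X, w_h, w_o):
    # Hidden layer with sigmoid activation, then a linear output layer,
    # a common choice when the targets are real-valued.
    h = sigmoid(np.dot(X, w_h))
    return np.dot(h, w_o)

# Shapes for k = 3 look-back steps of 10-dimensional data and 100 hidden units.
X_demo = np.zeros((4, 30))
w_h_demo = init_weights([30, 100])
w_o_demo = init_weights([100, 10], seed=1)
py_x_demo = model(X_demo, w_h_demo, w_o_demo)  # shape (4, 10)
```

The output layer is left linear on purpose: the targets are z-scored values that can fall outside (0, 1), so squashing the output with a sigmoid would make them unreachable.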

After this we can grab the training data, process it, and start training with the following code. Since we are doing regression, the loss function is the sum of squared differences between the predicted output and the actual output, divided by twice the number of training samples:

trX, trY, teX, teY = prepare_data(data, T)

X = tf.placeholder("float", [None, trX.shape[1]])
Y = tf.placeholder("float", [None, trY.shape[1]])

hidden_neuron_num = 100
w_h = init_weights([trX.shape[1], hidden_neuron_num])  # 100 hidden neurons
w_o = init_weights([hidden_neuron_num, trY.shape[1]])

py_x = model(X, w_h, w_o)

cost = tf.reduce_sum(tf.pow(Y-py_x, 2))/(2 * trX.shape[0])
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost)

init = tf.initialize_all_variables()
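To make the cost concrete, here is the same quantity computed in NumPy on made-up numbers: the squared errors are summed over all samples and output dimensions, then divided by twice the number of samples. The function name l2_cost and the example arrays are illustrative only.

```python
import numpy as np

def l2_cost(y_true, y_pred):
    # Sum of squared errors over samples and outputs, divided by 2 * n_samples.
    n_samples = y_true.shape[0]
    return np.sum((y_true - y_pred) ** 2) / (2.0 * n_samples)

y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 1.0], [2.0, 4.0]])
demo_cost = l2_cost(y_true, y_pred)  # (0 + 1 + 1 + 0) / (2 * 2) = 0.5
```

Because the sum also runs over the output dimensions, the value grows with the number of outputs; it is not a per-dimension mean squared error.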

Start training:

training_epochs = 2000
display_step = 50
batch_size = 128
with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(training_epochs):
        # iterate over full mini-batches only; a trailing partial batch is skipped
        for i in range(trX.shape[0] // batch_size):
            start_ind = i*batch_size
            end_ind = start_ind + batch_size
            sess.run(train_op, feed_dict={X: trX[start_ind:end_ind,:], Y: trY[start_ind:end_ind,:]})
           
        if epoch % display_step == 0:
            print "Epoch", '%04d' % (epoch+1), "cost=", "{:.9f}".format(sess.run(cost, feed_dict={X: trX, Y:trY}))
        
    print "Optimizer finished"
    training_cost = sess.run(cost, feed_dict={X:trX, Y:trY})
    print "Training cost= ", training_cost
    
    
    print "Testing...(L2 Loss Comparison)"
    testing_cost = sess.run(tf.reduce_sum(tf.pow(Y-py_x, 2))/(2*teX.shape[0]),
                            feed_dict={X: teX, Y: teY}) #same function as cost above
    print "Testing cost=", testing_cost
    print "Absolute L2 loss difference: ", abs(training_cost - testing_cost)
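The training loop above only visits full batches, so up to batch_size - 1 samples at the end of trX are never used for updates. If that matters, one option is a small generator that also yields the final, shorter batch. This is a sketch of an alternative, not what the code above does, and iterate_minibatches is a hypothetical name.

```python
import numpy as np

def iterate_minibatches(inputs, targets, batch_size):
    # Yield consecutive (inputs, targets) slices, including a shorter last batch.
    n_samples = inputs.shape[0]
    for start in range(0, n_samples, batch_size):
        end = min(start + batch_size, n_samples)
        yield inputs[start:end], targets[start:end]

demo_X = np.arange(20).reshape(10, 2)
demo_Y = np.arange(10).reshape(10, 1)
batch_sizes = [len(bx) for bx, by in iterate_minibatches(demo_X, demo_Y, 4)]
# batch_sizes is [4, 4, 2]
```

Since X and Y are declared with a None batch dimension, the shorter last batch can be fed to the session without any reshape.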

The end result is:

Training cost=  0.480179
Testing...(L2 Loss Comparison)
Testing cost= 0.517389
Absolute L2 loss difference:  0.037210
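One way to put these numbers in context: the data are z-scored, so a model that always predicts zero would score roughly half the number of output dimensions under this cost (about 5 for 10-dimensional data). The sketch below checks that on synthetic data; the random array only stands in for the real recordings, which are not reproduced here.

```python
import numpy as np

rng = np.random.RandomState(0)
fake = rng.normal(3.0, 2.0, size=(10, 500))  # (features, time steps), synthetic

# Per-feature z-scoring, as in prepare_data.
z = (fake - fake.mean(axis=1, keepdims=True)) / fake.std(axis=1, keepdims=True)

targets = z.T  # one row per sample
zero_pred = np.zeros_like(targets)
baseline_cost = np.sum((targets - zero_pred) ** 2) / (2.0 * targets.shape[0])
# baseline_cost equals n_features / 2 = 5.0 up to floating-point error
```

Against that reference, a training cost near 0.48 and a testing cost near 0.52 would be roughly a tenth of the predict-zero baseline, though the comparison is only approximate for the real data.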

We will talk about how to train an RNN in the next post.

About buttonzzj

Root~