Now let's build a more sophisticated RNN so that we can look back dynamically, without hard-coding the lookback value as we did in the previous post.

In its simplest form, an RNN is defined as:

def RNN(_X, _istate, _weights, _biases):
    # input shape: (batch_size, n_steps, n_input)
    _X = tf.transpose(_X, [1, 0, 2])  # permute n_steps and batch_size
    # Reshape to prepare input to hidden activation
    _X = tf.reshape(_X, [-1, n_input])  # (n_steps*batch_size, n_input)
    # Linear activation
    _X = tf.matmul(_X, _weights['hidden']) + _biases['hidden']
    # Define a basic RNN cell with tensorflow
    basic_rnn_cell = rnn_cell.BasicRNNCell(n_hidden)
    # Split data because rnn cell needs a list of inputs for the RNN inner loop
    _X = tf.split(0, n_steps, _X)  # n_steps * (batch_size, n_hidden)
    # Get basic RNN cell output
    outputs, states = rnn.rnn(basic_rnn_cell, _X, initial_state=_istate)
    # Linear activation on the last output of the inner loop
    return tf.matmul(outputs[-1], _weights['out']) + _biases['out']

One thing to notice is that TensorFlow already has an RNNCell defined, so we just need to call the API. (Remember the old days when we had to write our own implementation? This is so much better!)
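For readers who never had to write their own: what BasicRNNCell computes internally is just the vanilla recurrence h_t = tanh(x_t W_xh + h_{t-1} W_hh + b). A minimal NumPy sketch (the shapes follow the n_input/n_hidden values used in this post; the random initialization and inputs are just for illustration):

```python
import numpy as np

def basic_rnn_step(x_t, h_prev, W_xh, W_hh, b):
    # one step of the vanilla RNN recurrence:
    # h_t = tanh(x_t W_xh + h_{t-1} W_hh + b)
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b)

rng = np.random.RandomState(0)
n_input, n_hidden = 10, 128
W_xh = rng.randn(n_input, n_hidden) * 0.01
W_hh = rng.randn(n_hidden, n_hidden) * 0.01
b = np.zeros(n_hidden)

h = np.zeros((1, n_hidden))   # initial state
for t in range(100):          # unroll n_steps = 100 time steps
    x_t = rng.randn(1, n_input)
    h = basic_rnn_step(x_t, h, W_xh, W_hh, b)
```

The unrolled loop is exactly what tf.split plus rnn.rnn builds as a static graph.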

Of course we need to modify the data preparation code a little. For the RNN, the maximum number of steps we look back is a choice of our own; I set it to 100. So each example we feed the network is an input of size input_size * maximum_lookback_steps together with its output.

# Network Parameters
n_input = 10
n_steps = 100   # timesteps
n_hidden = 128  # hidden layer num of features
n_out = 10

def prepare_data(data, T):  # look back T steps max
    data_num = data.shape[1]
    data_size = data.shape[0]
    std = np.std(data, axis=1)
    mean = np.mean(data, axis=1)
    # first we need to normalize the data (zero mean, unit variance per feature)
    for i in range(data_num):
        for j in range(data_size):
            data[j, i] = (data[j, i] - mean[j]) / std[j]
    data_num = data_num - T  # we need to start at X[T] to look back T steps
    input_size = (T, data.shape[0])
    output_size = data.shape[0]
    all_input = np.zeros((T, data.shape[0], data_num))
    all_output = np.zeros((output_size, data_num))
    for i in range(data_num):
        all_output[:, i] = data[:, i + T]
        for j in range(T):
            all_input[j, :, i] = data[:, i + T - j - 1]
    # five-fold cross-validation split
    order = np.random.permutation(data_num)
    training_num = int(data_num * 4 / 5)
    testing_num = data_num - training_num
    training_order = order[0:training_num]
    testing_order = order[training_num:data_num]
    training_input = all_input[:, :, training_order]
    training_output = all_output[:, training_order]
    testing_input = all_input[:, :, testing_order]
    testing_output = all_output[:, testing_order]
    return (training_input.transpose((2, 0, 1)), training_output.transpose(),
            testing_input.transpose((2, 0, 1)), testing_output.transpose())

Now you can start training. It is pretty much the same as the previous post so I will not post it here.

The end result looks like this:

Iter 14000, Minibatch Loss= 0.162863
Iter 15000, Minibatch Loss= 0.164219
Iter 16000, Minibatch Loss= 0.181672
Iter 17000, Minibatch Loss= 0.129802
Iter 18000, Minibatch Loss= 0.144143
Iter 19000, Minibatch Loss= 0.157080

Note that we are using the same loss function as the NN implementation, so this is actually much better than our previous result!

We also did a Bayesian Neural Network implementation of this; it will be posted soon.

I only have a MacBook Pro with me, so I can only run TensorFlow in CPU mode. Anyway, she gave me a dataset, claiming that it is some kind of neural spiking data, and asked me to build two neural networks (a simple NN and a simple RNN).

So the input data is 10-dimensional. The model needs to look back several time steps to determine the next output.

**Case 1: Simple NN.**

In this case we need to specify the lookback length k, say k = 3. We then concatenate the k previous inputs into a vector of dimension k*Input_length and pass it to a hidden layer.

Thus the prepare data function looks like this:

def prepare_data(data, T):  # returning X[n-1...n-T] as input, X[n] as output
    data_num = data.shape[1]
    data_size = data.shape[0]
    std = np.std(data, axis=1)
    mean = np.mean(data, axis=1)
    # first we need to normalize the data (zero mean, unit variance per feature)
    for i in range(data_num):
        for j in range(data_size):
            data[j, i] = (data[j, i] - mean[j]) / std[j]
    data_num = data_num - T  # we need to start at X[T] to look back T steps
    input_size = T * data.shape[0]
    output_size = data.shape[0]
    all_input = np.zeros((input_size, data_num))
    all_output = np.zeros((output_size, data_num))
    for i in range(data_num):
        all_output[:, i] = data[:, i + T]
        for j in range(T):
            all_input[j * data.shape[0]:(j + 1) * data.shape[0], i] = data[:, i + T - j - 1]
    # five-fold cross-validation split
    order = np.random.permutation(data_num)
    training_num = int(data_num * 4 / 5)
    testing_num = data_num - training_num
    training_order = order[0:training_num]
    testing_order = order[training_num:data_num]
    training_input = all_input[:, training_order]
    training_output = all_output[:, training_order]
    testing_input = all_input[:, testing_order]
    testing_output = all_output[:, testing_order]
    return (training_input.transpose(), training_output.transpose(),
            testing_input.transpose(), testing_output.transpose())
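To see what this windowing produces, here is a stripped-down sketch (a hypothetical make_windows helper, with the normalization and shuffling omitted) that builds the same concatenated lookback vectors, most recent step first:

```python
import numpy as np

def make_windows(data, T):
    # data: (n_features, n_samples); stack the T previous steps,
    # most recent first, exactly as the inner loop of prepare_data does
    n_feat, n_samp = data.shape
    X = np.zeros((n_samp - T, T * n_feat))
    y = np.zeros((n_samp - T, n_feat))
    for i in range(n_samp - T):
        X[i] = np.concatenate([data[:, i + T - j - 1] for j in range(T)])
        y[i] = data[:, i + T]
    return X, y

data = np.arange(2000, dtype=float).reshape(10, 200)  # toy data
X, y = make_windows(data, T=3)
# X has shape (197, 30): 197 usable samples, 3 steps of 10 features each
```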

The model is defined in the following snippet. Note that sigmoid is not the best choice here, but I was just playing with TensorFlow, so getting it working was my main goal.

[snipped id="20"]

After this we can grab the training data, process it, and start training using the following code. Note that we are doing regression, so the loss is the summed squared (L2) difference between the predicted output and the actual output:

trX, trY, teX, teY = prepare_data(data, T)
X = tf.placeholder("float", [None, trX.shape[1]])
Y = tf.placeholder("float", [None, trY.shape[1]])
hidden_neuron_num = 100
w_h = init_weights([trX.shape[1], hidden_neuron_num])  # 100 hidden neurons
w_o = init_weights([hidden_neuron_num, trY.shape[1]])
py_x = model(X, w_h, w_o)
cost = tf.reduce_sum(tf.pow(Y - py_x, 2)) / (2 * trX.shape[0])
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost)
init = tf.initialize_all_variables()
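As a sanity check on the cost expression, here is the same L2 formula evaluated in NumPy on a couple of toy arrays (values made up):

```python
import numpy as np

# toy targets and predictions, 2 samples x 2 outputs
Y = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = np.array([[1.5, 2.0], [2.0, 4.0]])

# sum of squared errors divided by 2N, matching the TensorFlow graph above
cost = np.sum((Y - pred) ** 2) / (2 * Y.shape[0])
# (0.25 + 0 + 1 + 0) / 4 = 0.3125
```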

Start training:

training_epochs = 2000
display_step = 50
batch_size = 128
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        for i in range(int(trX.shape[0] / batch_size)):
            start_ind = i * batch_size
            end_ind = min((i + 1) * batch_size, trX.shape[0])
            sess.run(train_op, feed_dict={
                X: trX[start_ind:end_ind, :].reshape(batch_size, trX.shape[1]),
                Y: trY[start_ind:end_ind, :].reshape(batch_size, trY.shape[1])})
        if epoch % display_step == 0:
            print "Epoch", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(sess.run(cost, feed_dict={X: trX, Y: trY}))
    print "Optimizer finished"
    training_cost = sess.run(cost, feed_dict={X: trX, Y: trY})
    print "Training cost= ", training_cost
    print "Testing...(L2 Loss Comparison)"
    testing_cost = sess.run(
        tf.reduce_sum(tf.pow(Y - py_x, 2)) / (2 * teX.shape[0]),
        feed_dict={X: teX, Y: teY})  # same loss function as cost above
    print "Testing cost=", testing_cost
    print "Absolute L2 loss difference: ", abs(training_cost - testing_cost)

The end result is:

Training cost= 0.480179
Testing...(L2 Loss Comparison)
Testing cost= 0.517389
Absolute L2 loss difference: 0.037210

We will talk about how to train an RNN in the next post.

The motivation is that the network needs to perform on-line learning with no cloud back end (to support real-time response).

Although a CNN can do a real-time forward pass with modern libraries like Caffe, training in real time on a resource-limited platform is still too much to ask.

A major advantage of our approach is that we use competitive Hebbian learning, so only a small portion of the network is updated for the current training input.

Details about the experiment can be found in this paper (accepted by CVPRW 2016).

Video demonstrations are also available on my YouTube channel (yes, I have a YouTube channel :))

General Introduction:

Demo1: Real-time testing

Demo2: Training and testing

Demo3: Blind testing

The agent is able to follow its internal intentions (GPS signals) and adjust its posture.

Training video is available below:


The source code (an IPython notebook) can be found at this link: HMMonWSJ

In this post we discuss how we train an HMM to write like a WSJ journalist!

Here we train a first-order HMM for POS tagging on a set of training data: we estimate the probabilities from the tagged sentences, use the trained model to automatically assign tags to the testing data, and finally evaluate the model's performance against the provided ground-truth tag assignments.

From the training data, we need to estimate both emission probabilities and transition probabilities.

- Emission probabilities: calculated as P(w|t) = count(w,t) / count(t), where w stands for the word and t stands for the tag.
- Transition probabilities: the tag set contains 45 tags. We implement a function that returns bigrams of tags, P(t_i | t_{i-1}) = count(t_{i-1}, t_i) / count(t_{i-1}). A better way of doing this may be to use a smoothing technique, but we will not discuss that in the current post.
- To calculate the transition probabilities (i.e., bigrams), for each sentence we add two extra tags to represent the sentence boundaries, i.e., START and END.
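As a toy illustration of these counts and ratios (the two-sentence corpus below is made up; the real training data is wsj1-18.training):

```python
from collections import defaultdict

# toy tagged corpus: lists of (word, tag) pairs
sentences = [[('the', 'DT'), ('dog', 'NN'), ('runs', 'VBZ')],
             [('the', 'DT'), ('cat', 'NN'), ('runs', 'VBZ')]]

tag_count = defaultdict(int)   # count(t)
word_tag = defaultdict(int)    # count(w, t)
trans = defaultdict(int)       # count(t_prev, t)

for sent in sentences:
    prev = 'START'
    tag_count['START'] += 1
    for w, t in sent:
        tag_count[t] += 1
        word_tag[(w, t)] += 1
        trans[(prev, t)] += 1
        prev = t
    tag_count['END'] += 1
    trans[(prev, 'END')] += 1

# P(w|t) = count(w,t)/count(t), P(t'|t) = count(t,t')/count(t)
p_emit = word_tag[('dog', 'NN')] / tag_count['NN']   # 1/2
p_trans = trans[('DT', 'NN')] / tag_count['DT']      # 2/2
```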

The code that does the counting is as follows (note that we store count(w,t) in word_table and count(t) in tag_table):

with open('./wsj1-18.training', 'r') as f:
    for line in f:
        # a new sentence starts
        current_tag = 'START'
        old_tag = 'START'
        current_word = ''
        tag_table[current_tag] = tag_table[current_tag] + 1
        tag_flag = 0  # after reading a tag, next is a word
        for word in line.split():
            if tag_flag == 0:  # this is a word!
                current_word = word
                tag_flag = 1
            elif tag_flag == 1:  # this is a tag!
                current_tag = word
                if current_tag in tag_table:
                    tag_table[current_tag] += 1
                else:
                    tag_table[current_tag] = 1
                if current_word in word_table:
                    if current_tag in word_table[current_word]:
                        word_table[current_word][current_tag] += 1
                    else:
                        word_table[current_word][current_tag] = 1
                else:
                    word_table[current_word] = {}
                    word_table[current_word][current_tag] = 1
                # update tag_transition
                if old_tag in tag_transition:
                    if current_tag in tag_transition[old_tag]:
                        tag_transition[old_tag][current_tag] += 1
                    else:
                        tag_transition[old_tag][current_tag] = 1
                else:
                    tag_transition[old_tag] = {}
                    tag_transition[old_tag][current_tag] = 1
                old_tag = current_tag
                tag_flag = 0
        # close the sentence with the END boundary tag
        current_tag = 'END'
        current_word = ''
        tag_table[current_tag] = tag_table[current_tag] + 1
        if old_tag in tag_transition:
            if current_tag in tag_transition[old_tag]:
                tag_transition[old_tag][current_tag] += 1
            else:
                tag_transition[old_tag][current_tag] = 1
        else:
            tag_transition[old_tag] = {}
            tag_transition[old_tag][current_tag] = 1

Now there are uncommon words in the training set. Words occurring fewer than five times in the training data are mapped to the token *UNKA*. The following code does this job:

word_table['UNKA'] = {}
remove_word_list = []
for current_tag in tag_table:
    word_table['UNKA'][current_tag] = 0
for current_word, current_tag_list in word_table.iteritems():
    if current_word == 'UNKA':  # do not fold the UNKA bucket into itself
        continue
    total = 0
    for current_tag, c_word_tag in current_tag_list.iteritems():
        total += c_word_tag
    if total < 5:  # we have an uncommon word
        for current_tag, c_word_tag in current_tag_list.iteritems():
            word_table['UNKA'][current_tag] += c_word_tag
        remove_word_list.append(current_word)
for current_word in remove_word_list:
    word_table.pop(current_word, None)
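In Python 3 (the code above is Python 2), the same UNKA mapping can be sketched on a toy word_table like this (the words and counts are made up):

```python
# toy word -> {tag: count} table
word_table = {
    'the': {'DT': 100},
    'zyzzyva': {'NN': 2},  # rare word
    'qat': {'NN': 1},      # rare word
}

# words seen fewer than 5 times donate their counts to 'UNKA'
unka = {}
for word in list(word_table):  # list() so we can delete while iterating
    counts = word_table[word]
    if sum(counts.values()) < 5:
        for tag, c in counts.items():
            unka[tag] = unka.get(tag, 0) + c
        del word_table[word]
word_table['UNKA'] = unka
```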

Afterwards we can calculate the emission and transition probabilities with ease:

import copy

Pwt = copy.deepcopy(word_table)  # keep the counts in word_table unchanged
for current_word, current_tag_list in word_table.iteritems():
    for current_tag, c_word_tag in current_tag_list.iteritems():  # c_ stands for count
        c_tag = tag_table[current_tag]
        Pwt[current_word][current_tag] = float(c_word_tag) / float(c_tag)

Pt_transition = copy.deepcopy(tag_transition)
for old_tag, next_tag_list in tag_transition.iteritems():
    for next_tag, c_old_current in next_tag_list.iteritems():
        c_tag = tag_table[old_tag]
        Pt_transition[old_tag][next_tag] = float(c_old_current) / float(c_tag)

With the transition and emission probabilities calculated, we can use the Viterbi algorithm to find the most likely tag sequence for a sentence. We will explain the Viterbi algorithm in future updates, but for now this is the code:

def viterbi(line, pwt, pt_transition, all_tag):
    result = {}  # result[final_tag] = [likelihood, [path]]
    current_word = line[-1]
    if len(line) == 1:  # base case: only one word left
        for current_tag in all_tag:
            a_ij = 0
            if current_tag in pt_transition['START']:
                a_ij = pt_transition['START'][current_tag]
            b_j = 0
            if current_tag in pwt[current_word]:
                b_j = pwt[current_word][current_tag]
            likelihood = b_j * a_ij
            path = ['START'] + [current_tag]
            result[current_tag] = [likelihood, path]
    else:
        v_i_result = viterbi(line[0:-1], pwt, pt_transition, all_tag)
        for current_tag in all_tag:
            likelihood = 0
            path = []
            b_j = 0
            if current_tag in pwt[current_word]:
                b_j = pwt[current_word][current_tag]
            for previous_tag in all_tag:
                if previous_tag == 'END':
                    continue
                a_ij = 0
                if current_tag in pt_transition[previous_tag]:
                    a_ij = pt_transition[previous_tag][current_tag]
                if likelihood < b_j * v_i_result[previous_tag][0] * a_ij:
                    likelihood = b_j * v_i_result[previous_tag][0] * a_ij
                    path = v_i_result[previous_tag][1] + [current_tag]
            result[current_tag] = [likelihood, path]
    return result
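For intuition, here is a compact iterative version of the same dynamic program on a made-up two-tag model (all probabilities below are invented for illustration, not taken from the WSJ data):

```python
tags = ['DT', 'NN']
start = {'DT': 0.8, 'NN': 0.2}                      # P(tag | START)
trans = {'DT': {'DT': 0.1, 'NN': 0.9},              # P(tag | previous tag)
         'NN': {'DT': 0.4, 'NN': 0.6}}
emit = {'DT': {'the': 0.9, 'dog': 0.1},             # P(word | tag)
        'NN': {'the': 0.05, 'dog': 0.95}}

def viterbi_decode(words):
    # V maps tag -> (best probability of any path ending here, that path)
    V = {t: (start[t] * emit[t][words[0]], [t]) for t in tags}
    for w in words[1:]:
        V_new = {}
        for t in tags:
            best = max(tags, key=lambda p: V[p][0] * trans[p][t])
            prob = V[best][0] * trans[best][t] * emit[t][w]
            V_new[t] = (prob, V[best][1] + [t])
        V = V_new
    return max(V.values(), key=lambda x: x[0])[1]

path = viterbi_decode(['the', 'dog'])
```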

Thus we have our HMM trained and ready to be tested!

The testing code can be found in the full source linked above (HMMonWSJ).

In short, we calculate the confusion matrix C[true_label, recognized_label].
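The construction is simple: for every test word, increment the cell indexed by its true tag and its recognized tag; the diagonal then holds the correct assignments. A toy sketch with three tags instead of 45:

```python
import numpy as np

n_tags = 3
true = np.array([0, 1, 2, 1, 0])  # made-up true tag indices
pred = np.array([0, 1, 1, 1, 2])  # made-up recognized tag indices

# C[true_label, recognized_label]
C = np.zeros((n_tags, n_tags), dtype=int)
for t, p in zip(true, pred):
    C[t, p] += 1

accuracy = np.trace(C) / C.sum()  # diagonal entries are correct tags
```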

A visualization of the confusion matrix is as follows:

It generates pretty good results. Testing is performed on 5927 sentences.


Source code can be found here: source

Source code can be found at source.

There are basically two ways of doing it: the height field fluid approach and the shallow water equations approach. We will talk about the first approach in this post.

A good tutorial on height field fluids can be found here: http://www.matthiasmueller.info/talks/gdc2008.pdf

Rendering is based on the code found at this link: http://www.lousodrome.net/opengl/.

The base code uses simple sine waves to simulate water flow. This post focuses on how we can create more complicated wave simulations based on simple physical laws.

The height field fluid model treats the interaction force on each column as proportional to the curvature of the height function.

Thus, according to Newton's law:

f(i,j) = curvature(i,j)

v(i,j) = v_old(i,j) + delta_t * f(i,j)

h(i,j) = h_old(i,j) + delta_t * v(i,j)

This is a simple update rule. The implementation can be found in main_height_fluid.c.
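The same three update equations fit in a few lines of NumPy (a toy sketch: periodic boundaries via np.roll for brevity, while the C code clamps at the borders; dt and damping values are illustrative):

```python
import numpy as np

def height_field_step(h, v, dt=0.1, damping=0.99):
    # curvature approximated by the 4-neighbour average minus the centre
    curvature = (np.roll(h, 1, 0) + np.roll(h, -1, 0) +
                 np.roll(h, 1, 1) + np.roll(h, -1, 1)) / 4.0 - h
    v_new = (v + dt * curvature) * damping   # v += dt * f, then damp
    h_new = h + dt * v_new                   # h += dt * v
    return h_new, v_new

h = np.zeros((8, 8))
v = np.zeros((8, 8))
h[4, 4] = 1.0                                # poke the surface once
for _ in range(10):
    h, v = height_field_step(h, v)           # the spike spreads outward
```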

Code for updating the height field:

void update_height(void)
{
    int i, j;
    for (j = 0; j < RESOLUTION; j++)
        for (i = 0; i <= RESOLUTION; i++) {
            float height_east  = height[MIN(i+1, RESOLUTION)][j];
            float height_west  = height[MAX(i-1, 0)][j];
            float height_south = height[i][MAX(j-1, 0)];
            float height_north = height[i][MIN(j+1, RESOLUTION-1)];
            /* force proportional to curvature: neighbour average minus centre */
            height_v[i][j] += (height_west + height_east +
                               height_south + height_north) / 4 - height[i][j];
            height_v[i][j] *= damping;
            height[i][j] += height_v[i][j];
        }
}

Desktop recording available at:

The shallow water equations are more complicated. Basically, they enforce conservation of mass and momentum in two directions, and also take the wave velocity in both directions into account.

Code for updating the height field using the shallow water equations. The Matlab code below comes from the reference I posted above; in the source code you will find my C++/Matlab versions of this equation solver:

% Reflective boundary conditions
H(:,1) = H(:,2);      U(:,1) = U(:,2);      V(:,1) = -V(:,2);
H(:,n+2) = H(:,n+1);  U(:,n+2) = U(:,n+1);  V(:,n+2) = -V(:,n+1);
H(1,:) = H(2,:);      U(1,:) = -U(2,:);     V(1,:) = V(2,:);
H(n+2,:) = H(n+1,:);  U(n+2,:) = -U(n+1,:); V(n+2,:) = V(n+1,:);

% First half step
% x direction
i = 1:n+1;
j = 1:n;
% height
Hx(i,j) = (H(i+1,j+1)+H(i,j+1))/2 - dt/(2*dx)*(U(i+1,j+1)-U(i,j+1));
% x momentum
Ux(i,j) = (U(i+1,j+1)+U(i,j+1))/2 - ...
          dt/(2*dx)*((U(i+1,j+1).^2./H(i+1,j+1) + g/2*H(i+1,j+1).^2) - ...
                     (U(i,j+1).^2./H(i,j+1) + g/2*H(i,j+1).^2));
% y momentum
Vx(i,j) = (V(i+1,j+1)+V(i,j+1))/2 - ...
          dt/(2*dx)*((U(i+1,j+1).*V(i+1,j+1)./H(i+1,j+1)) - ...
                     (U(i,j+1).*V(i,j+1)./H(i,j+1)));

% y direction
i = 1:n;
j = 1:n+1;
% height
Hy(i,j) = (H(i+1,j+1)+H(i+1,j))/2 - dt/(2*dy)*(V(i+1,j+1)-V(i+1,j));
% x momentum
Uy(i,j) = (U(i+1,j+1)+U(i+1,j))/2 - ...
          dt/(2*dy)*((V(i+1,j+1).*U(i+1,j+1)./H(i+1,j+1)) - ...
                     (V(i+1,j).*U(i+1,j)./H(i+1,j)));
% y momentum
Vy(i,j) = (V(i+1,j+1)+V(i+1,j))/2 - ...
          dt/(2*dy)*((V(i+1,j+1).^2./H(i+1,j+1) + g/2*H(i+1,j+1).^2) - ...
                     (V(i+1,j).^2./H(i+1,j) + g/2*H(i+1,j).^2));

% Second half step
i = 2:n+1;
j = 2:n+1;
% height
H(i,j) = H(i,j) - (dt/dx)*(Ux(i,j-1)-Ux(i-1,j-1)) - ...
                  (dt/dy)*(Vy(i-1,j)-Vy(i-1,j-1));
% x momentum
U(i,j) = U(i,j) - (dt/dx)*((Ux(i,j-1).^2./Hx(i,j-1) + g/2*Hx(i,j-1).^2) - ...
                           (Ux(i-1,j-1).^2./Hx(i-1,j-1) + g/2*Hx(i-1,j-1).^2)) ...
                - (dt/dy)*((Vy(i-1,j).*Uy(i-1,j)./Hy(i-1,j)) - ...
                           (Vy(i-1,j-1).*Uy(i-1,j-1)./Hy(i-1,j-1)));
% y momentum
V(i,j) = V(i,j) - (dt/dx)*((Ux(i,j-1).*Vx(i,j-1)./Hx(i,j-1)) - ...
                           (Ux(i-1,j-1).*Vx(i-1,j-1)./Hx(i-1,j-1))) ...
                - (dt/dy)*((Vy(i-1,j).^2./Hy(i-1,j) + g/2*Hy(i-1,j).^2) - ...
                           (Vy(i-1,j-1).^2./Hy(i-1,j-1) + g/2*Hy(i-1,j-1).^2));

This post contains the source code and a description of how I implemented the spinning cube project.

The source code is available here: cubecollision

To use this source code:

- Open ./Windows/ogldev.sln and build the program.
- The cube is transparent.
- Control keys: 0, 1, 2, 3, 4, 5, 6, 7, corresponding to the 8 corners of the cube.
- The program references the tutorial at http://ogldev.atspace.co.uk/
- Collision is detected by checking vertex coordinates.

On every update, we need to recompute the locations of the 8 vertices of the cube.

The update function:

void Update()
{
    m_translate.m[0][3] = m_translate.m[0][3] + v_velocity.x * dt;
    m_translate.m[1][3] = m_translate.m[1][3] + v_velocity.y * dt;
    m_translate.m[2][3] = m_translate.m[2][3] + v_velocity.z * dt;

    float angularVelocity = sqrt(v_angularVelocity.x * v_angularVelocity.x +
                                 v_angularVelocity.y * v_angularVelocity.y +
                                 v_angularVelocity.z * v_angularVelocity.z);
    float u = v_angularVelocity.x, v = v_angularVelocity.y, w = v_angularVelocity.z;
    rotate_angle = angularVelocity * dt + rotate_angle;

    /* axis-angle rotation matrix (Rodrigues' formula) about axis (u, v, w) */
    float c = cosf(rotate_angle), s = sinf(rotate_angle);
    float n2 = angularVelocity * angularVelocity;
    m_rotate.m[0][0] = (u*u + (v*v + w*w) * c) / n2;
    m_rotate.m[0][1] = (u*v * (1 - c) - w * angularVelocity * s) / n2;
    m_rotate.m[0][2] = (u*w * (1 - c) + v * angularVelocity * s) / n2;
    m_rotate.m[1][0] = (u*v * (1 - c) + w * angularVelocity * s) / n2;
    m_rotate.m[1][1] = (v*v + (u*u + w*w) * c) / n2;
    m_rotate.m[1][2] = (v*w * (1 - c) - u * angularVelocity * s) / n2;
    m_rotate.m[2][0] = (u*w * (1 - c) - v * angularVelocity * s) / n2;
    m_rotate.m[2][1] = (v*w * (1 - c) + u * angularVelocity * s) / n2;
    m_rotate.m[2][2] = (w*w + (u*u + v*v) * c) / n2;

    m_transform = m_translate * m_rotate * m_rotate_old;
    Check_Collide();

    clock_t newTime = clock();
    clock_t duration = newTime - Time;
    delta_time = duration / (double)CLOCKS_PER_SEC;

    //if (v_velocity.y > 0.05f || m_translate.m[1][3] >= -0.74)
    //    v_velocity.y = v_velocity.y - 0.1f * delta_time;  // g = 0.1
    //else
    //    v_velocity.y = 0.f;
}

Here we update the rotation matrix, angular velocity, and rotation angle, and then compute the overall transformation matrix.
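The element-wise expressions in Update() are Rodrigues' axis-angle rotation formula. The same matrix in the usual compact form, as a NumPy sketch:

```python
import numpy as np

def axis_angle_matrix(axis, angle):
    # Rodrigues' formula: R = I + sin(a) K + (1 - cos(a)) K^2,
    # where K is the cross-product (skew-symmetric) matrix of the unit axis
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    K = np.array([[0.0, -u[2], u[1]],
                  [u[2], 0.0, -u[0]],
                  [-u[1], u[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

R = axis_angle_matrix([0, 0, 1], np.pi / 2)  # 90 degrees about z
p = R @ np.array([1.0, 0.0, 0.0])            # the x-axis maps to the y-axis
```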

To check whether the inner cube is going to collide with the outer cube, I use the following code:

void Check_Collide()
{
    ......
    for (int i = 0; i <= 7; i++) {
        Vertices_new.x = Vertices[i].x * m_transform.m[0][0] +
                         Vertices[i].y * m_transform.m[0][1] +
                         Vertices[i].z * m_transform.m[0][2] +
                         1 * m_transform.m[0][3];
        Vertices_new.y = Vertices[i].x * m_transform.m[1][0] +
                         Vertices[i].y * m_transform.m[1][1] +
                         Vertices[i].z * m_transform.m[1][2] +
                         1 * m_transform.m[1][3];
        Vertices_new.z = Vertices[i].x * m_transform.m[2][0] +
                         Vertices[i].y * m_transform.m[2][1] +
                         Vertices[i].z * m_transform.m[2][2] +
                         1 * m_transform.m[2][3];
        if (Vertices_new.x > 1.0f || Vertices_new.x < -1.0f ||
            Vertices_new.y > 1.0f || Vertices_new.y < -1.0f ||
            Vertices_new.z > 1 || Vertices_new.z < -1) {
            Vertices_flag = i;
            break;
        }
    }
    ......
}

What it does is update the locations of the vertices and check them against the boundary. If a collision is detected, a force in the opposite direction is applied at that vertex, and the angular velocity and velocity are updated.
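A hypothetical Python sketch of that response: one simple way to realize "a force in the opposite direction" is to flip and damp the velocity component that crossed the [-1, 1] boundary (the restitution factor 0.8 is an illustrative assumption, not a value from the project):

```python
def respond(vertex, velocity, restitution=0.8):
    # vertex, velocity: [x, y, z] lists; box is the cube [-1, 1]^3
    vertex, velocity = list(vertex), list(velocity)
    for k in range(3):
        if vertex[k] > 1.0 or vertex[k] < -1.0:
            velocity[k] = -restitution * velocity[k]   # reflect and damp
            vertex[k] = max(-1.0, min(1.0, vertex[k])) # clamp back inside
    return vertex, velocity

# vertex escaped through the +x face; only the x components change
v_pos, v_vel = respond([1.2, 0.0, -0.5], [0.5, 0.1, -0.2])
```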

If an additional force is added via user interaction, then addForce is invoked:

void addForce(int i)
{
    Vector3f Centroid;
    Vector3f Vertices_new;
    Centroid.x = m_transform.m[0][3];
    Centroid.y = m_transform.m[1][3];
    Centroid.z = m_transform.m[2][3];
    Vertices_new.x = Vertices[i].x * m_transform.m[0][0] +
                     Vertices[i].y * m_transform.m[0][1] +
                     Vertices[i].z * m_transform.m[0][2] +
                     1 * m_transform.m[0][3];
    Vertices_new.y = Vertices[i].x * m_transform.m[1][0] +
                     Vertices[i].y * m_transform.m[1][1] +
                     Vertices[i].z * m_transform.m[1][2] +
                     1 * m_transform.m[1][3];
    Vertices_new.z = Vertices[i].x * m_transform.m[2][0] +
                     Vertices[i].y * m_transform.m[2][1] +
                     Vertices[i].z * m_transform.m[2][2] +
                     1 * m_transform.m[2][3];
    v_velocity.x = 10.f * abs(Vertices_new.x - Centroid.x) + v_velocity.x;
    v_velocity.y = 10.f * abs(Vertices_new.y - Centroid.y) + v_velocity.y;
    v_velocity.z = 10.f * abs(Vertices_new.z - Centroid.z) + v_velocity.z;
}


To test real-time input, I referred to some existing programs on the web and successfully hooked my webcam into the program. Then I went mad and commanded the program to take 600 pictures of me in one hour, at an interval of 6 seconds.

The final AVI output is on my Weibo account. Hope you enjoy it.

Paper: http://www.heikohoffmann.de/htmlthesis/hoffmann_diss.html
