In this notebook a deep neural network is implemented in TensorFlow and used to train an imitation learning agent, following the CS 294 course. The agent's task is to imitate the expert's policy.
The structure of the network is the following:
INPUT -> FC -> ReLU -> DROPOUT -> FC -> ReLU -> FC (linear output) -> L2 LOSS.
Apart from the DNN implementation, this notebook demonstrates methods for loading saved models and accessing desired tensors.
We start with the implementation of the network.
In [1]:
# Used to clear up the workspace.
%reset -f
import numpy as np
import pickle
import tensorflow as tf
from sklearn.model_selection import train_test_split
# Load the data.
data = pickle.load(open('../data/data-ant.pkl', 'rb'))
actions = data['actions']
observations = data['observations']
X_train, X_test, y_train, y_test = train_test_split(observations, actions, test_size=0.20, random_state=42)
num_train = X_train.shape[0]
num_test = X_test.shape[0]
In [2]:
# Parameters.
model_path = "./model"
learning_rate = 0.01
training_epochs = 10
batch_size = 1000
display_step = 1
reg = 0.001
dropout_prob = 1.0 # Keep probability fed to dropout; 1.0 means no dropout is performed.
# Network parameters.
num_hidden_1 = 128
num_hidden_2 = 128
num_inputs = observations.shape[1]
num_outputs = actions.shape[1]
In [3]:
# Tensors.
X = tf.placeholder(tf.float32, shape=[None, num_inputs], name='X')
y = tf.placeholder(tf.float32, shape=[None, num_outputs], name='y')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')
# Define the weights and biases.
weights = {
'W1': tf.Variable(tf.random_normal([num_inputs, num_hidden_1], stddev=np.sqrt(2.0 / num_inputs))),
'W2': tf.Variable(tf.random_normal([num_hidden_1, num_hidden_2], stddev=np.sqrt(2.0 / num_hidden_1))),
'W3': tf.Variable(tf.random_normal([num_hidden_2, num_outputs], stddev=np.sqrt(2.0 / num_hidden_2)))
}
biases = {
'b1': tf.Variable(tf.random_normal([num_hidden_1])),
'b2': tf.Variable(tf.random_normal([num_hidden_2])),
'b3': tf.Variable(tf.random_normal([num_outputs]))
}
In [4]:
# Create the model.
def two_layer_nn(X, weights, biases):
    # First hidden layer with ReLU activations.
    layer_1 = tf.add(tf.matmul(X, weights['W1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Add dropout.
    layer_1_drop = tf.nn.dropout(layer_1, keep_prob)
    # Second hidden layer with ReLU activations.
    layer_2 = tf.add(tf.matmul(layer_1_drop, weights['W2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation.
    out_layer = tf.add(tf.matmul(layer_2, weights['W3']), biases['b3'], name='pred')
    return out_layer
In [5]:
# Construct the model.
pred = two_layer_nn(X, weights, biases)
# Define the loss and the optimizer.
data_cost = tf.reduce_mean(tf.reduce_sum(tf.square(y - pred), axis=1)) / 2
reg_cost = reg * (tf.nn.l2_loss(weights['W1']) + tf.nn.l2_loss(weights['W2']) + tf.nn.l2_loss(weights['W3']))
cost = tf.add(data_cost, reg_cost, name='cost')
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Initialize the variables.
init = tf.global_variables_initializer()
saver = tf.train.Saver() # Used to save the session.
In [6]:
# Launch the graph.
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        avg_cost = 0.
        # Loop over all batches.
        total_batch = num_train // batch_size
        for i in range(total_batch):
            curr_id, curr_batch = i * batch_size, batch_size
            if i == total_batch - 1:
                curr_batch += num_train % batch_size
            batch_x, batch_y = X_train[curr_id : curr_id + curr_batch], y_train[curr_id : curr_id + curr_batch]
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_x, y: batch_y, keep_prob: dropout_prob})
            avg_cost += c * curr_batch / num_train
        if epoch % display_step == 0:
            print "Epoch: %4d, cost=%.9f" % (epoch + 1, avg_cost)
    print "Optimization finished!\n"
    # Save the model.
    save_path = saver.save(sess, model_path)
    # Test the model.
    training_cost = sess.run(cost, feed_dict={X: X_train, y: y_train, keep_prob: 1.0})
    test_cost = sess.run(cost, feed_dict={X: X_test, y: y_test, keep_prob: 1.0})
    print "Training cost=%f, test cost=%f" % (training_cost, test_cost)
The model is stored at model_path, which in our case points to the current directory, under the name model.
Now we would like to load the model from a different program and use it to make predictions.
To make this easier, it is advisable to label the tensors we will need. In our use case we simply want to validate the model or make predictions, so we only need the cost and pred tensors.
But, as we will soon see, we also need the input tensors X, y and keep_prob.
Looking at the code above, we can see that these tensors were given sensible names via the name arguments. We will now proceed to using the trained model.
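A side note: all of the names above were attached at construction time through the name arguments. If a tensor was created without a convenient name, it can still be exposed under one, for example with tf.identity. A minimal sketch (the name 'prediction' here is illustrative, not part of the model above):
# Illustrative sketch: exposing an existing tensor under a chosen name.
pred_named = tf.identity(pred, name='prediction')
# After restoring the graph, the tensor is then reachable as 'prediction:0'
# (the ':0' suffix selects the first output of the op with that name).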
In [7]:
%reset -f
import numpy as np
import pickle
import tensorflow as tf
# Load the data.
data = pickle.load(open('../data/data-ant.pkl', 'rb'))
num_data = data['actions'].shape[0]
# Parameters.
model_path = "./model"
# Take a random sample for validation/prediction.
num_test = 2
idx = np.random.randint(num_data, size=num_test)
X = data['observations'][idx]
y = data['actions'][idx]
In [8]:
# Load the previous graph.
loader = tf.train.import_meta_graph(model_path + ".meta")
# Handle of the loaded graph.
g = tf.get_default_graph()
cost = g.get_tensor_by_name("cost:0")
with tf.Session() as sess:
    loader.restore(sess, model_path)  # Restores all the trained variables, e.g. the weights.
    test_cost = sess.run(cost, feed_dict={'X:0': X, 'y:0': y, 'keep_prob:0': 1.0})
    print "Test cost=%f" % test_cost
    y_ = sess.run('pred:0', feed_dict={'X:0': X, 'keep_prob:0': 1.0})
    print y
    print y_
Notice the lines above:
1. cost = g.get_tensor_by_name("cost:0")
2. sess.run(cost, feed_dict={'X:0': X, 'y:0': y, 'keep_prob:0': 1.0})
3. sess.run('pred:0', feed_dict={'X:0': X, 'keep_prob:0': 1.0})
In practice, the first two lines use the same mechanism as the third: they evaluate a tensor from the restored graph.
To access such a tensor we can either obtain a reference to it (useful if we need to manipulate it further), or simply pass its name to sess.run when all we want is to evaluate it.
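To make the equivalence concrete, here is a small sketch; it assumes it runs inside the with tf.Session() block above, so that sess, g and X are available:
# Option 1: obtain a reference to the tensor, then evaluate it.
pred_tensor = g.get_tensor_by_name('pred:0')
y_by_ref = sess.run(pred_tensor, feed_dict={'X:0': X, 'keep_prob:0': 1.0})
# Option 2: pass the tensor name directly to sess.run.
y_by_name = sess.run('pred:0', feed_dict={'X:0': X, 'keep_prob:0': 1.0})
# Both evaluate the same node, so the results match.
assert np.allclose(y_by_ref, y_by_name)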
There is another useful way to do this: collections. During model construction we add the tensors we are interested in to a collection; later on, we can retrieve them from that collection. Here is a demonstration:
In [9]:
def example_code():
    # While building the model, add the tensors we want to reuse to a collection.
    tf.add_to_collection('reuse', cost)
    tf.add_to_collection('reuse', pred)
    # To retrieve a tensor from the collection (pred was added second, hence index 1).
    pred_restored = tf.get_collection('reuse')[1]
    # To print the contents of a collection.
    print tf.get_collection('reuse')
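Since collections are serialized into the .meta file together with the graph, the same retrieval also works after a restore. A minimal sketch, assuming the model had been saved with the 'reuse' collection from the example above:
# Illustrative sketch: retrieving tensors from a collection after a restore.
loader = tf.train.import_meta_graph(model_path + ".meta")
with tf.Session() as sess:
    loader.restore(sess, model_path)
    cost, pred = tf.get_collection('reuse')  # Same order as they were added.
    test_cost = sess.run(cost, feed_dict={'X:0': X, 'y:0': y, 'keep_prob:0': 1.0})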
In [10]:
def example_code():
    print(tf.get_default_graph().as_graph_def())
Printing the graph definition, as above, gives us the names of the tensors and the links between them. For more complicated graphs, however, the output becomes too verbose to be useful, so we turn to TensorBoard, a tool for visualizing the computation graph and virtually anything else within the training process.
A bare minimum example of that is shown below.
In [11]:
%reset -f
import numpy as np
import tensorflow as tf
# Parameters.
model_path = "./model"
# Load the previous graph.
loader = tf.train.import_meta_graph(model_path + ".meta")
# Launch the graph.
with tf.Session() as sess:
    loader.restore(sess, model_path)  # Restores all the trained variables, e.g. the weights.
    # Write the graph to /tmp/test.
    writer = tf.summary.FileWriter("/tmp/test", graph=tf.get_default_graph())
The graph is now saved to the folder /tmp/test. To display it, open a terminal and run
tensorboard --logdir=/tmp/test
The output should be something like
Starting TensorBoard 46 at http://0.0.0.0:6006
Simply open the given address in your browser and you should see TensorBoard. Navigate to the GRAPHS section and voilà!
Admittedly, the graph is rather big and it takes some time to figure out what is happening in it, but since we are usually interested in either the cost or the final output, we can navigate straight to the end of the graph. In this case we find something like this:
We can see that both the cost and pred tensors are there.
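Beyond the graph itself, TensorBoard can also plot scalars such as the training cost over time. A minimal sketch of what that would look like if added to the training code above (the tag name and log directory are illustrative):
# Illustrative sketch: logging the cost as a scalar summary during training.
cost_summary = tf.summary.scalar('cost', cost)                 # while building the graph
writer = tf.summary.FileWriter("/tmp/test", graph=sess.graph)  # inside the session
summary_str = sess.run(cost_summary, feed_dict={X: batch_x, y: batch_y, keep_prob: dropout_prob})
writer.add_summary(summary_str, epoch)                         # shows up under the SCALARS tab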