# From linear to non-linear models with MNIST

## Introduction

In this practical we will experiment further with linear and non-linear models using the MNIST dataset. MNIST consists of images of handwritten digits that we want to classify correctly.

Learning objectives:

• Implement a linear classifier on the MNIST image dataset in TensorFlow.
• Modify the code to make the classifier non-linear by introducing a hidden non-linear layer.

What is expected of you:

• Step through the code and make sure you understand each step. What test set accuracy do you get? (Expected: about 90%.)
• Modify the code to make the classifier non-linear by adding a hidden layer with a non-linear activation function in TensorFlow. What accuracy do you get now? (Expected: about 92%.)
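
To make the difference between the two tasks concrete, here is a minimal framework-free sketch of a single forward pass through each model. It uses only the Python standard library; the toy sizes and all weight values are hypothetical and chosen purely for illustration, not taken from the practical.

```python
import math

def affine(x, W, b):
    """Compute x @ W + b for one input vector x (W is a list of rows)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

def relu(v):
    """Element-wise max(0, v)."""
    return [max(0.0, a) for a in v]

def softmax(v):
    """Numerically stable softmax over a list of logits."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy input: 3 features instead of MNIST's 784 pixels.
x = [1.0, -2.0, 0.5]

# Linear classifier: the logits are a single affine map of the input.
W_lin = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
b_lin = [0.0, 0.0]
probs_linear = softmax(affine(x, W_lin, b_lin))

# Non-linear classifier: affine map -> ReLU -> second affine map.
W_h = [[0.5, -0.3], [0.1, 0.8], [-0.2, 0.4]]
b_h = [0.1, 0.1]
W_out = [[1.0, -1.0], [-0.5, 0.5]]
b_out = [0.0, 0.0]
hidden = relu(affine(x, W_h, b_h))  # negative pre-activations are clipped to 0
probs_nonlinear = softmax(affine(hidden, W_out, b_out))

print("linear model:    ", probs_linear)
print("hidden layer:    ", hidden)
print("non-linear model:", probs_nonlinear)
```

Without the `relu` call, the two affine maps would compose into a single affine map, so the second model would still be a linear classifier. The activation function is what makes the hidden layer add expressive power.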

Some parts of the code were adapted from the DL Indaba practicals.

``````

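# NOTE: this code targets TensorFlow 1.x (tf.placeholder, tf.Session, tensorflow.examples).
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

# load MNIST with one-hot labels (the download directory "MNIST_data/" is an assumed path) #
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)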

def display_mnist_images(gens, num_images):
    """Plot the first num_images entries of gens side by side as 28x28 grayscale images."""
    plt.rcParams['image.interpolation'] = 'nearest'
    plt.rcParams['image.cmap'] = 'gray'
    fig, axs = plt.subplots(1, num_images, figsize=(25, 3))
    for i in range(num_images):
        # rescale pixel values from [0, 1] to [0, 255] for display
        reshaped_img = (gens[i].reshape(28, 28) * 255).astype(np.uint8)
        axs.flat[i].imshow(reshaped_img)
    plt.show()

# visualize random MNIST images #
batch_xs, batch_ys = mnist.train.next_batch(10)
list_of_images = np.split(batch_xs, 10)
display_mnist_images(list_of_images, 10)

x_dim, train_examples, n_classes = mnist.train.images.shape[1], mnist.train.num_examples, mnist.train.labels.shape[1]

######################################
# define the model (build the graph) #
######################################
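num_nodes = 50  # number of units in the hidden layer
x = tf.placeholder(tf.float32, [None, x_dim])      # input images
y = tf.placeholder(tf.float32, [None, n_classes])  # one-hot labels

# Hidden Layer
W = tf.Variable(tf.random_normal([x_dim, num_nodes]))
b = tf.Variable(tf.ones([num_nodes]))
y_ = tf.matmul(x, W) + b             # hidden-layer pre-activations
activations_hidden = tf.nn.relu(y_)  # non-linearity

# Output Layer
W_O = tf.Variable(tf.random_normal([num_nodes, n_classes]))
b_O = tf.Variable(tf.ones([n_classes]))
y_O = tf.matmul(activations_hidden, W_O) + b_O  # output logits

prob = tf.nn.softmax(y_O)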

########################
# define loss function #
########################

cross_entropy_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_O, labels=y))

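learning_rate = 0.01
# optimization op used in the training cycle; plain gradient descent is an assumed choice #
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy_loss)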

###########################
# define model evaluation #
###########################

actual_class, predicted_class = tf.argmax(y, 1), tf.argmax(prob, 1)
correct_prediction = tf.cast(tf.equal(predicted_class, actual_class), tf.float32)
classification_accuracy = tf.reduce_mean(correct_prediction)

#########################
# define training cycle #
#########################

num_epochs = 50
batch_size = 20

# initializing the variables before starting the session #
init = tf.global_variables_initializer()

# launch the graph in a session (use the session as a context manager) #
with tf.Session() as sess:
    # run session #
    sess.run(init)
    # start main training cycle #
    for epoch in range(num_epochs):
        avg_cost = 0.
        avg_acc = 0.
        total_batch = int(mnist.train.num_examples / batch_size)
        # loop over all batches #
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # run optimization op (backprop), cost op and accuracy op (to get training losses) #
            _, c, a = sess.run([train_step, cross_entropy_loss, classification_accuracy], feed_dict={x: batch_x, y: batch_y})
            # compute avg training loss and avg training accuracy #
            avg_cost += c / total_batch
            avg_acc += a / total_batch
        # display logs per epoch step #
        if epoch % 1 == 0:
            print("Epoch {}: cross-entropy-loss = {:.4f}, training-accuracy = {:.3f}%".format(epoch + 1, avg_cost, avg_acc * 100))
    print("Optimization Finished!")
    # calculate test set accuracy (must run while the session is still open) #
    test_accuracy = classification_accuracy.eval({x: mnist.test.images, y: mnist.test.labels})
    print("Accuracy on test set = {:.3f}%".format(test_accuracy * 100))

``````
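
The loss and evaluation ops in the cell above can be checked by hand on a tiny batch. The following is a minimal sketch in plain Python, assuming the same definitions as the graph (softmax cross-entropy averaged over the batch, and accuracy as the fraction of matching argmax indices). The logits and labels are made-up values, and the helper names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, one_hot):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def argmax(v):
    """Index of the largest entry."""
    return max(range(len(v)), key=lambda i: v[i])

# Made-up batch of 3 examples with 3 classes each.
batch_logits = [[2.0, 0.5, -1.0],
                [0.1, 0.2, 3.0],
                [1.0, 1.5, 0.0]]
batch_labels = [[1, 0, 0],
                [0, 0, 1],
                [1, 0, 0]]

# Mirrors tf.reduce_mean over the per-example cross-entropy.
mean_loss = sum(cross_entropy(l, t)
                for l, t in zip(batch_logits, batch_labels)) / len(batch_logits)

# Mirrors tf.reduce_mean over tf.equal(argmax(prob), argmax(y)).
# Softmax is monotonic, so argmax of the logits equals argmax of the probabilities.
accuracy = sum(argmax(l) == argmax(t)
               for l, t in zip(batch_logits, batch_labels)) / len(batch_logits)

print("mean cross-entropy:", mean_loss)
print("accuracy:", accuracy)
```

The first two examples are classified correctly and the third is not, so the accuracy is 2/3. A useful sanity check for the loss is that all-equal logits over `k` classes give a cross-entropy of `log(k)`, which is roughly the value a freshly initialized classifier that predicts uniformly would report.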