Title: TensorFlow Multi-Classification Neural Network
Slug: tensorflow-multi-neural-network
Summary: A TensorFlow implementation of a neural network within a multi-class classification problem.
Date: 2017-01-17 19:11
Category: Neural Networks
Tags: Basics
Authors: Thomas Pinder

The reasoning behind decisions such as label encoding and the choice of activation function will not be discussed in this tutorial; it has already been covered in detail in a previous blog post demonstrating a multi-classification neural network in Keras here, albeit on a different dataset.

This tutorial will be faster paced than the one above, implementing a neural network on the iris dataset using TensorFlow, the library on which Keras is built. The iris dataset is a very popular dataset within machine learning, featuring four continuous columns and three possible labels corresponding to plant species.

Load the data


In [206]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the data
iris = datasets.load_iris()

# Split into features and labels
X = iris.data
y = iris.target

Preprocessing

With the data loaded, we must one-hot encode the labels and normalise the features.


In [207]:
import numpy as np
from keras.utils import to_categorical

# Scale the feature matrix by its Frobenius norm
X_norm = X / np.linalg.norm(X)

# Encode labels array
y_enc = to_categorical(y)

# Split up into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X_norm, y_enc, test_size = 0.3, random_state = 123)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)


(105, 4)
(45, 4)
(105, 3)
(45, 3)
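
As an aside, the normalisation above divides the whole matrix by a single Frobenius norm. A more common approach is to scale each feature independently; a minimal sketch using scikit-learn's MinMaxScaler (not used for the results reported below) would look like this:

from sklearn.preprocessing import MinMaxScaler

# Rescale each of the four features to the [0, 1] range independently
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

In practice the scaler would be fitted on the training split only, but the matrix-wide scaling above is kept so the results in this post are reproducible.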

Modelling

With the data in the desired format, we can now begin to build a neural network. An important first step is to initialise a TensorFlow interactive session, creating an environment in which TensorFlow can store and use objects.


In [208]:
import tensorflow as tf
sess = tf.InteractiveSession()
seed = 123

With a TensorFlow session in place, the next step in building a neural network is to define the layer sizes. In TensorFlow, data is fed into the network through placeholder objects.


In [209]:
input_size = X_norm.shape[1]
output_size = y_enc.shape[1]
print("Input Size: {}".format(input_size))
print("Output Size: {}".format(output_size))
X_ = tf.placeholder(tf.float32, shape = [None, input_size])
y_ = tf.placeholder(tf.float32, shape = [None, output_size])


Input Size: 4
Output Size: 3

One of the main differences between Keras and TensorFlow is the lower-level explicitness that TensorFlow requires. Keras is a great choice for quickly building a neural network, but this simplicity comes at the cost of reduced control for the user. An early example of this additional control appears here: in TensorFlow, the weights and biases must be explicitly defined before the model is run. In this tutorial we'll build a neural network with just one hidden layer of width eight.


In [210]:
hidden_width = 8

# First, let's define the weights connecting input to hidden and hidden to output
weight_1 = tf.Variable(tf.random_normal(shape=[input_size, hidden_width], seed = seed)) 
weight_2 = tf.Variable(tf.random_normal(shape=[hidden_width, output_size], seed = seed)) 

# Now let's define the biases, initialised to zero
bias_1 = tf.Variable(tf.zeros(shape=[hidden_width]))  
bias_2 = tf.Variable(tf.zeros(shape=[output_size]))

We now have the connections between the layers, so the next step in the architecture is to define the layers themselves. As in the previous MNIST tutorial (here), we'll use the ReLU and softmax functions as the activation functions for the hidden and output layers respectively.


In [211]:
hidden_layer = tf.nn.relu(tf.add(tf.matmul(X_, weight_1), bias_1))
output_layer = tf.nn.softmax(tf.add(tf.matmul(hidden_layer, weight_2), bias_2))
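
As a quick sanity check on the wiring, the static shapes of these tensors can be inspected before any data is fed in:

# The row dimension is unknown (?) because the placeholders accept any batch size
print(hidden_layer.get_shape())   # (?, 8)
print(output_layer.get_shape())   # (?, 3)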

So far we have processed the data, initialised the weights and biases and defined the layer structure; we just need to tell the network how to learn through a cost function and an optimisation procedure. In this tutorial we'll use the mean squared error as our cost function and the Adam optimiser.


In [212]:
cost = tf.reduce_mean(tf.squared_difference(output_layer, y_))
optimiser = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
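
Mean squared error works adequately here, but a softmax cross-entropy loss is the more conventional choice for multi-class classification. A minimal sketch of that alternative (not used for the results below; the logits, xent_cost and xent_optimiser names are illustrative) is shown here. Note that the cross-entropy op expects the pre-softmax logits rather than the softmax output:

# Keep the linear (pre-softmax) output separate for the cross-entropy op
logits = tf.add(tf.matmul(hidden_layer, weight_2), bias_2)
xent_cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))
xent_optimiser = tf.train.AdamOptimizer(learning_rate=0.01).minimize(xent_cost)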

Initialising the Network

With the architecture in place, the network can now be initialised within our previously created environment.


In [213]:
init = tf.global_variables_initializer()
sess.run(init)

Training

To make use of our cost function and optimiser, the number of epochs must be set; the network then undergoes forward and backward propagation on each pass over the training data.


In [214]:
epochs = 4000
for i in range(epochs):
    sess.run(optimiser, feed_dict={X_:X_train, y_:y_train})
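
If you want to monitor convergence, the cost tensor can be fetched in the same call as the optimiser. A small sketch of the same loop with periodic logging:

for i in range(epochs):
    # Run one optimisation step and fetch the current cost in a single call
    _, current_cost = sess.run([optimiser, cost], feed_dict={X_: X_train, y_: y_train})
    if i % 500 == 0:
        print("Epoch {}: cost = {:.4f}".format(i, current_cost))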

Testing

With a model now trained, we can evaluate its accuracy on the test set.


In [215]:
predictions = sess.run(output_layer, feed_dict={X_: X_test})
print(100*np.sum(np.argmax(predictions,1)==np.argmax(y_test,1))/y_test.shape[0])


95.55555555555556
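
Overall accuracy can hide per-class behaviour; scikit-learn's metrics give a fuller picture. A short sketch using the predictions computed above:

from sklearn.metrics import classification_report

# Convert probability rows and one-hot labels back to class indices
y_pred = np.argmax(predictions, axis=1)
y_true = np.argmax(y_test, axis=1)
print(classification_report(y_true, y_pred, target_names=iris.target_names))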
