(C) 2018-2019 by Damir Cavar
Version: 1.1, January 2019
This is a tutorial related to the L665 course on Machine Learning for NLP focusing on Deep Learning, Spring 2018 and 2019 at Indiana University.
This material is based on Jason Brownlee's tutorial Develop Your First Neural Network in Python With Keras Step-By-Step. See that page for more details and explanations. All copyrights remain his, except for a few small comments that I added.
Keras is a neural network module that runs on top of TensorFlow (among other backends). Make sure that TensorFlow is installed on your system. Go to the Keras homepage and install the module in Python. This example also requires that SciPy and NumPy are installed on your system.
As explained in the above tutorial, the steps are: loading the data, defining the model, compiling the model, fitting it to the data, evaluating it, and making predictions.
We have to import the necessary modules from Keras:
In [2]:
from keras.models import Sequential
from keras.layers import Dense
We will use numpy as well:
In [2]:
import numpy
In his tutorial, linked above, Jason Brownlee suggests initializing the random number generator with a fixed number so that the results are the same on every run, since the learning algorithm makes use of a stochastic process. We initialize the random number generator with 7:
In [3]:
numpy.random.seed(7)
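To illustrate what seeding accomplishes, re-seeding with the same value reproduces the identical sequence of pseudo-random numbers. A minimal sketch, separate from the tutorial's code:

```python
import numpy

# Seeding twice with the same value yields the same draws.
numpy.random.seed(7)
a = numpy.random.rand(3)
numpy.random.seed(7)
b = numpy.random.rand(3)
print(numpy.array_equal(a, b))  # True: identical sequences
```

Note that this seeds only NumPy's generator; full run-to-run reproducibility of the training itself also depends on the backend.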
The data set suggested in Brownlee's tutorial is the Pima Indians Diabetes Data Set. The required file can be downloaded using this link. It is also available in the local data subfolder as a file with the .csv extension.
In [4]:
dataset = numpy.loadtxt("data/pima-indians-diabetes.csv", delimiter=",")
The data is organized as follows: the first 8 columns per row define the features, that is, the input variables for the neural network. The last column defines the output as a binary value of $0$ or $1$. We can separate these two parts of the dataset into two variables:
In [5]:
X = dataset[:,0:8]
Y = dataset[:,8]
Just to verify the content:
In [6]:
X
Out[6]:
In [7]:
Y
Out[7]:
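The column slicing used above can be illustrated on a small synthetic matrix (toy values, not the diabetes data):

```python
import numpy

# A toy 2-row dataset: 8 feature columns followed by 1 label column.
toy = numpy.array([[1, 2, 3, 4, 5, 6, 7, 8, 0],
                   [9, 8, 7, 6, 5, 4, 3, 2, 1]])
features = toy[:, 0:8]  # all rows, columns 0 through 7
labels = toy[:, 8]      # all rows, column 8 only
print(features.shape)   # (2, 8)
print(labels)           # [0 1]
```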
We will define our model in the next step. The first layer is the input layer. It is set to have 8 inputs for the 8 feature variables using the attribute input_dim. The Dense class defines a fully connected layer. The number of neurons is specified as the first argument to the initializer, and the activation function is chosen via the activation attribute. This should be clear from the presentations in class and the discussions in related notebooks in this collection. The output layer consists of one neuron and uses the sigmoid activation function to return a value between $0$ and $1$:
In [23]:
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
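As a sanity check on the architecture: a fully connected layer has inputs × units weights plus units biases, so the parameter counts of the three layers can be computed by hand (plain arithmetic, not part of the original tutorial):

```python
# Trainable parameters per Dense layer: inputs * units + units (biases)
layer1 = 8 * 12 + 12   # 108
layer2 = 12 * 8 + 8    # 104
layer3 = 8 * 1 + 1     # 9
total = layer1 + layer2 + layer3
print(total)           # 221
```

The same total can be read off from model.summary().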
The defined network needs to be compiled. The compilation process creates a specific implementation of the network using the backend (e.g. TensorFlow or Theano), decides whether a GPU or a CPU will be used, and fixes which loss and optimization functions to use and which metrics to collect during training. In this case we use binary cross-entropy as the loss function, an efficient implementation of a gradient descent algorithm called Adam as the optimizer, and we record the classification accuracy for output and analysis.
In [24]:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
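Binary cross-entropy, the loss chosen here, is $-(y \log p + (1 - y) \log(1 - p))$ averaged over the instances. A hand computation on toy predictions (illustrative values, not the model's output):

```python
import numpy

def binary_crossentropy(y_true, y_pred):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all instances."""
    y_true = numpy.asarray(y_true, dtype=float)
    y_pred = numpy.asarray(y_pred, dtype=float)
    return -numpy.mean(y_true * numpy.log(y_pred)
                       + (1 - y_true) * numpy.log(1 - y_pred))

# Confident predictions on the correct classes give a small loss.
print(binary_crossentropy([1, 0], [0.9, 0.1]))  # ~0.105
```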
The training of the model is achieved by calling the fit method. The parameters specify the input matrix and the output vector, as well as the number of iterations through the data set, called epochs. The batch size specifies the number of instances that are evaluated before an update of the model parameters is applied.
In [26]:
model.fit(X, Y, epochs=150, batch_size=4)
Out[26]:
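Quick arithmetic shows how many parameter updates this training run performs, assuming the 768 rows stated in the UCI description of the data set:

```python
import math

samples, batch_size, epochs = 768, 4, 150
updates_per_epoch = math.ceil(samples / batch_size)
print(updates_per_epoch)           # 192 updates per epoch
print(updates_per_epoch * epochs)  # 28800 updates in total
```

A larger batch size would reduce the number of updates per epoch and typically speed up each epoch at the cost of noisier-to-smoother gradient trade-offs.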
The evaluation is available via the evaluate method. In our case we print out the accuracy:
In [11]:
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
We can now make predictions by calling the predict method with the input matrix as a parameter. In this case we are using the training data to predict the output class. This is in general not a good idea, since evaluating on the training data overestimates performance; here it just serves the purpose of showing how the methods are used:
In [12]:
predictions = model.predict(X)
In [13]:
rounded = [round(x[0]) for x in predictions]
print(rounded)
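The rounded predictions can be compared against the gold labels directly to recover the accuracy by hand. A sketch on toy arrays (hypothetical values, not the model's actual output):

```python
import numpy

gold = numpy.array([1, 0, 1, 1])
pred = numpy.array([0.8, 0.3, 0.6, 0.2])  # hypothetical sigmoid outputs
rounded = numpy.round(pred)               # threshold at 0.5
accuracy = numpy.mean(rounded == gold)
print(accuracy)  # 0.75: three of the four toy predictions match
```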