This example demonstrates building a simple dense neural network with Keras. It uses the Agaricus-Lepiota dataset to detect poisonous mushrooms.
In [1]:
from pandas import read_csv
srooms_df = read_csv('../data/agaricus-lepiota.data.csv')
srooms_df.head()
Out[1]:
The LabelEncoder converts two-valued (e.g. T/F) data to 1s and 0s, while the LabelBinarizer converts categorical data to a one-hot encoding. If we wanted to use all the features in the training set, we would need to map each one out:
import sklearn.preprocessing

column_names = srooms_df.axes[1]

def get_mapping(name):
    # The label and the binary gill-attachment column map to single integers;
    # all other categorical columns map to one-hot vectors.
    if name == 'edibility' or name == 'gill-attachment':
        return (name, sklearn.preprocessing.LabelEncoder())
    else:
        return (name, sklearn.preprocessing.LabelBinarizer())

mappings = list(map(get_mapping, column_names))
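As a quick illustration of the two encoders (a minimal sketch; the sample values below are made up for demonstration, not drawn from the dataset):

from sklearn.preprocessing import LabelEncoder, LabelBinarizer

# LabelEncoder assigns an integer to each class, in sorted order: e -> 0, p -> 1
print(LabelEncoder().fit_transform(['p', 'e', 'p']))   # [1 0 1]

# LabelBinarizer expands each class into a one-hot row vector
print(LabelBinarizer().fit_transform(['a', 'c', 'f']))
# [[1 0 0]
#  [0 1 0]
#  [0 0 1]]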
To make things interesting, we will use only a subset of the features. Are there simple rules, or a handful of features, that can be used to test edibility? Let's try a few.
In [2]:
from sklearn_pandas import DataFrameMapper
import sklearn
import numpy as np
mappings = [
    ('edibility', sklearn.preprocessing.LabelEncoder()),
    ('odor', sklearn.preprocessing.LabelBinarizer()),
    ('habitat', sklearn.preprocessing.LabelBinarizer()),
    ('spore-print-color', sklearn.preprocessing.LabelBinarizer())
]
In [3]:
mapper = DataFrameMapper(mappings)
srooms_np = mapper.fit_transform(srooms_df.copy())
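The DataFrameMapper applies each encoder to its named column and concatenates the resulting columns into a single NumPy array, one row per sample.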
The transformed data should have 26 features per sample: the encoded edibility label followed by three one-hot groups. The breakdown is as follows:

odor: [almond=a, creosote=c, foul=f, anise=l, musty=m, none=n, pungent=p, spicy=s, fishy=y] (9 features)
habitat: [woods=d, grasses=g, leaves=l, meadows=m, paths=p, urban=u, waste=w] (7 features)
spore-print-color: [buff=b, chocolate=h, black=k, brown=n, orange=o, green=r, purple=u, white=w, yellow=y] (9 features)
In [4]:
print(srooms_np.shape)
print("Frist sample: {}".format(srooms_np[0]))
print(" edibility (poisonous): {}".format(srooms_np[0][0]))
print(" ordr (pungent): {}".format(srooms_np[0][1:10]))
print(" habitat (urban): {}".format(srooms_np[0][10:17]))
print(" spore-print-color (black): {}".format(srooms_np[0][17:]))
Before we train the neural network, let's split the data into training and test datasets.
In [5]:
from sklearn.model_selection import train_test_split
train, test = train_test_split(srooms_np, test_size = 0.2, random_state=7)
train_labels = train[:,0:1]
train_data = train[:,1:]
test_labels = test[:,0:1]
test_data = test[:,1:]
print('training data dims: {}, label dims: {}'.format(train_data.shape,train_labels.shape))
print('test data dims: {}, label dims: {}'.format(test_data.shape,test_labels.shape))
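For the full Agaricus-Lepiota dataset of 8,124 samples, the 80/20 split should yield 6,499 training and 1,625 test samples (which is why the evaluation step later uses batch_size=1625).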
We will create a simple three-layer neural network. The network contains two dense layers and a dropout layer (to avoid overfitting).

A dense layer applies an activation function to the output of $W \cdot x + b$. Under the covers, Keras represents the layer's weights as a matrix; the inputs, outputs, and biases are vectors. If the dense layer had only three inputs and three outputs, the operation would look like this:
$$ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \operatorname{relu}\left( \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} \\ W_{2,1} & W_{2,2} & W_{2,3} \\ W_{3,1} & W_{3,2} & W_{3,3} \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right) $$

If this operation were decomposed further, it would look like this:
$$ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} \operatorname{relu}(W_{1,1} x_1 + W_{1,2} x_2 + W_{1,3} x_3 + b_1) \\ \operatorname{relu}(W_{2,1} x_1 + W_{2,2} x_2 + W_{2,3} x_3 + b_2) \\ \operatorname{relu}(W_{3,1} x_1 + W_{3,2} x_2 + W_{3,3} x_3 + b_3) \end{bmatrix} $$

The Rectified Linear Unit (ReLU) activation is simply $\operatorname{relu}(x) = \max(0, x)$: negative pre-activations are clamped to zero and positive values pass through unchanged.
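To make the algebra concrete, here is a minimal NumPy sketch of the same computation (the weights, biases, and inputs below are made-up values, not trained parameters):

import numpy as np

def relu(x):
    return np.maximum(0, x)

# Made-up 3x3 weight matrix, bias vector, and input vector for illustration
W = np.array([[ 0.2, -0.5,  0.1],
              [ 0.7,  0.3, -0.2],
              [-0.4,  0.6,  0.9]])
b = np.array([0.1, -0.1, 0.0])
x = np.array([1.0, 0.0, 1.0])

y = relu(W @ x + b)   # exactly the dense-layer equation above
print(y)              # approximately [0.4 0.4 0.5]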
In [6]:
from keras.models import Sequential
from keras.layers import Dense, Dropout
model = Sequential()
model.add(Dense(20, activation='relu', input_dim=25))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.summary()
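If everything is wired up as above, the summary should report 25 × 20 weights + 20 biases = 520 parameters for the hidden layer and 20 weights + 1 bias = 21 for the output layer, 541 trainable parameters in total (the dropout layer adds none).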
In [7]:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
Keras provides callbacks as a means to instrument internal state. In this example, we will write a TensorFlow event log. The event log enables a TensorBoard visualization of the translated model, and it also captures key metrics during training.
Note: This step is completely optional and depends on the backend engine.
In [8]:
from keras.callbacks import TensorBoard
tensor_board = TensorBoard(log_dir='./logs/keras_srooms', histogram_freq=1)
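After the training run below writes its events, the dashboard can be launched with tensorboard --logdir=./logs/keras_srooms from a shell (assuming TensorBoard was installed alongside the TensorFlow backend).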
In [9]:
model.fit(train_data, train_labels, epochs=10, batch_size=32, callbacks=[tensor_board])
Out[9]:
In [10]:
score = model.evaluate(test_data, test_labels, batch_size=1625)
print(score)
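Keras returns the loss first, followed by each metric requested at compile time, so score[1] is the test accuracy.

Next, let's save the trained model so it can be restored later. The model definition (architecture) can be serialized to YAML: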
In [11]:
print(model.to_yaml())
definition = model.to_yaml()
We also need to save the parameters, or weights, learned during training.
In [12]:
model.save_weights('/tmp/srmooms.hdf5')
In [13]:
from keras.models import model_from_yaml
new_model = model_from_yaml(definition)
new_model.load_weights('/tmp/srmooms.hdf5')
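As a sanity check (a minimal sketch, assuming the original model object is still in memory), the reloaded model should reproduce the original predictions exactly:

import numpy as np
# The reconstructed model carries the same weights, so predictions must match
assert np.allclose(model.predict(test_data), new_model.predict(test_data))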
Let's run some predictions on the newly instantiated model.
In [14]:
predictions = new_model.predict(test_data[0:25]).round()
for i in range(25):
    if predictions[i]:
        print('Test sample {} is poisonous.'.format(i))
In [15]:
predictions = new_model.predict(test_data).round()
labels = test_labels[:,0]
In [16]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(labels,predictions)
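In scikit-learn's convention, the rows of cm are the true classes and the columns are the predicted classes. Because the LabelEncoder assigned classes alphabetically, class 0 is edible and class 1 is poisonous; that ordering is used for the plot labels below.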
In [17]:
import matplotlib.pyplot as plt
import itertools
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)

    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
In [18]:
plot_confusion_matrix(cm,['edible','poisonous'])
plt.show()