😓Be prepared: code that worked for me may not work for you any more. It took me so much time tonight to debug, upgrade/install packages, change deprecated functions, or just ignore warnings.... All because of the frequent changes in these open source packages. So when it's your turn to try the code, who knows whether it will still work...
💝However, if you are reading my code, you are lucky! At least I took notes on the things that need care, including the solutions.
❣️Also note: in the model evaluation here I didn't evaluate all of the testing data, because labeling all of those testing images would take a huge amount of time and I'm really busy. Instead, pay attention to val_acc and val_loss: higher val_acc and lower val_loss are better.
Get data from here: https://datahack.analyticsvidhya.com/contest/practice-problem-identify-the-digits/
In [12]:
%matplotlib inline
import os
import numpy as np
import pandas as pd
from imageio import imread
from sklearn.metrics import accuracy_score
import pylab
import tensorflow as tf
import keras
You may get an error saying module "weakref" cannot be imported. This problem did not exist before and just appeared... Here's my solution:
- Run pip show tensorflow to find where TensorFlow is installed.
- In the TensorFlow source file that raises the error, change from backports import weakref to import weakref.
- The error is about the finalize() function, which is used for garbage collection, but finalize does not exist in backports' weakref in my case.... 😓
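For illustration only, the import-guard idea looks like this (a sketch; the actual fix in my case was editing the TensorFlow source file that pip show tensorflow points to):
In [ ]:
# sketch of the fallback: prefer the stdlib weakref, which has finalize() since Python 3.4
try:
    import weakref
except ImportError:
    from backports import weakref  # older fallback; finalize() may be missing here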
In [13]:
seed = 10
rng = np.random.RandomState(seed)
In [14]:
train = pd.read_csv('digit_recognition/train.csv')
test = pd.read_csv('digit_recognition/test.csv')
train.head()
Out[14]:
In [15]:
img_name = rng.choice(train.filename)
training_image_path = 'digit_recognition/Images/train/' + img_name
training_img = imread(training_image_path, as_gray=True)
pylab.imshow(training_img, cmap='gray')
pylab.axis('off')
pylab.show()
In [7]:
training_img[7:9]
Out[7]:
In [16]:
# store all images as numpy arrays, to make data manipulation easier
temp = []
for img_name in train.filename:
    training_image_path = 'digit_recognition/Images/train/' + img_name
    training_img = imread(training_image_path, as_gray=True)
    img = training_img.astype('float32')
    temp.append(img)
train_x = np.stack(temp)
train_x /= 255.0
train_x = train_x.reshape(-1, 784).astype('float32')
temp = []
for img_name in test.filename:
    testing_image_path = 'digit_recognition/Images/test/' + img_name
    testing_img = imread(testing_image_path, as_gray=True)
    img = testing_img.astype('float32')
    temp.append(img)
test_x = np.stack(temp)
test_x /= 255.0
test_x = test_x.reshape(-1, 784).astype('float32')
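A quick sanity check on the shapes (the counts below assume the Analytics Vidhya digits set, which ships 49,000 training and 21,000 testing 28x28 images):
In [ ]:
print(train_x.shape)  # expected: (49000, 784)
print(test_x.shape)   # expected: (21000, 784)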
In [18]:
train_x
Out[18]:
In [19]:
train_y = keras.utils.np_utils.to_categorical(train.label.values)
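to_categorical one-hot encodes the labels into 10-dimensional vectors; a quick illustration (the printed values are what I'd expect, not pasted from a run):
In [ ]:
print(train.label.values[0])  # a digit label, e.g. 4
print(train_y[0])             # e.g. [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]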
In [20]:
# split into training and validation sets, 7:3
split_size = int(train_x.shape[0]*0.7)
train_x, val_x = train_x[:split_size], train_x[split_size:]
train_y, val_y = train_y[:split_size], train_y[split_size:]
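One caveat: this sequential split assumes the CSV rows are in random order; if they were sorted by label, the validation set would be skewed. A shuffled variant would look like this (a sketch to use in place of the cell above, reusing the seeded rng for reproducibility):
In [ ]:
idx = rng.permutation(train_x.shape[0])   # shuffled indices
train_x, train_y = train_x[idx], train_y[idx]
split_size = int(train_x.shape[0]*0.7)
train_x, val_x = train_x[:split_size], train_x[split_size:]
train_y, val_y = train_y[:split_size], train_y[split_size:]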
In [21]:
train.label.iloc[split_size:split_size+2]
Out[21]:
In [22]:
from keras.models import Sequential
from keras.layers import Dense
# define variables
input_num_units = 784
hidden1_num_units = 500
hidden2_num_units = 500
hidden3_num_units = 500
hidden4_num_units = 500
hidden5_num_units = 500
output_num_units = 10
epochs = 10
batch_size = 128
Keras updated to 2.0
Without updating Keras, the way Dense() is used below may keep giving warnings.
sudo pip install --upgrade keras==2.1.3. It has to be Keras 2.1.3; if it's higher, softmax may throw the error below.... (this is why I hate deep learning when you have to rely on open source!)
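For reference, the Keras 2 spelling of these layers avoids the deprecation warnings: output_dim becomes units, and input_dim is only needed on the first layer. A minimal example:
In [ ]:
from keras.layers import Dense
# Keras 2 style; equivalent to Dense(output_dim=..., input_dim=...) used below
first_hidden = Dense(units=hidden1_num_units, input_dim=input_num_units, activation='relu')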
In [78]:
# Method 1 - Without Regularization
import warnings
warnings.filterwarnings('ignore')
model = Sequential()
model.add(Dense(output_dim=hidden1_num_units, input_dim=input_num_units, activation='relu'))
model.add(Dense(output_dim=hidden2_num_units, input_dim=hidden1_num_units, activation='relu'))
model.add(Dense(output_dim=hidden3_num_units, input_dim=hidden2_num_units, activation='relu'))
model.add(Dense(output_dim=hidden4_num_units, input_dim=hidden3_num_units, activation='relu'))
model.add(Dense(output_dim=hidden5_num_units, input_dim=hidden4_num_units, activation='relu'))
model.add(Dense(output_dim=output_num_units, input_dim=hidden5_num_units, activation='softmax'))
In [79]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
trained_model_5d = model.fit(train_x, train_y, nb_epoch=epochs, batch_size=batch_size, validation_data=(val_x, val_y))
In [66]:
# one sample evaluation
pred = model.predict_classes(test_x)
img_name = rng.choice(test.filename)
testing_image_path = 'digit_recognition/Images/test/' + img_name
testing_img = imread(testing_image_path, as_gray=True)
test_index = int(img_name.split('.')[0]) - train.shape[0]
print "Prediction is: ", pred[test_index]
pylab.imshow(testing_img, cmap='gray')
pylab.axis('off')
pylab.show()
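Since accuracy_score was imported at the top but the full test set isn't labeled, here is a quick accuracy check on the whole validation set instead (a sketch):
In [ ]:
val_pred = model.predict_classes(val_x)
print(accuracy_score(np.argmax(val_y, axis=1), val_pred))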
In [68]:
from keras import regularizers
In [80]:
# Method 2 - With L2 regularizer
model = Sequential([
Dense(output_dim=hidden1_num_units, input_dim=input_num_units, activation='relu',
kernel_regularizer=regularizers.l2(0.0001)), # lambda = 0.0001
Dense(output_dim=hidden2_num_units, input_dim=hidden1_num_units, activation='relu',
kernel_regularizer=regularizers.l2(0.0001)),
Dense(output_dim=hidden3_num_units, input_dim=hidden2_num_units, activation='relu',
kernel_regularizer=regularizers.l2(0.0001)),
Dense(output_dim=hidden4_num_units, input_dim=hidden3_num_units, activation='relu',
kernel_regularizer=regularizers.l2(0.0001)),
Dense(output_dim=hidden5_num_units, input_dim=hidden4_num_units, activation='relu',
kernel_regularizer=regularizers.l2(0.0001)),
Dense(output_dim=output_num_units, input_dim=hidden5_num_units, activation='softmax'),
])
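For intuition, l2(0.0001) adds 0.0001 * sum(W**2) to the training loss for each regularized layer, which keeps the weights small. A conceptual numpy sketch (W here is a hypothetical weight matrix, not pulled from the model):
In [ ]:
W = rng.randn(input_num_units, hidden1_num_units).astype('float32')  # hypothetical weights
l2_penalty = 0.0001 * np.sum(W ** 2)
print(l2_penalty)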
In [81]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
trained_model_5d = model.fit(train_x, train_y, nb_epoch=epochs, batch_size=batch_size, validation_data=(val_x, val_y))
In [82]:
# Method 3 - L1 Regularizer
model = Sequential([
Dense(output_dim=hidden1_num_units, input_dim=input_num_units, activation='relu',
kernel_regularizer=regularizers.l1(0.0001)),
Dense(output_dim=hidden2_num_units, input_dim=hidden1_num_units, activation='relu',
kernel_regularizer=regularizers.l1(0.0001)),
Dense(output_dim=hidden3_num_units, input_dim=hidden2_num_units, activation='relu',
kernel_regularizer=regularizers.l1(0.0001)),
Dense(output_dim=hidden4_num_units, input_dim=hidden3_num_units, activation='relu',
kernel_regularizer=regularizers.l1(0.0001)),
Dense(output_dim=hidden5_num_units, input_dim=hidden4_num_units, activation='relu',
kernel_regularizer=regularizers.l1(0.0001)),
Dense(output_dim=output_num_units, input_dim=hidden5_num_units, activation='softmax'),
])
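The L1 version instead penalizes 0.0001 * sum(|W|), which tends to push weights to exactly zero (sparsity). Conceptually, using the same hypothetical W from the L2 sketch:
In [ ]:
l1_penalty = 0.0001 * np.sum(np.abs(W))
print(l1_penalty)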
In [83]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
trained_model_5d = model.fit(train_x, train_y, nb_epoch=epochs, batch_size=batch_size, validation_data=(val_x, val_y))
In [84]:
# method 4 - Dropout
from keras.layers.core import Dropout
model = Sequential([
Dense(output_dim=hidden1_num_units, input_dim=input_num_units, activation='relu'),
Dropout(0.25), # the drop probability is 0.25
Dense(output_dim=hidden2_num_units, input_dim=hidden1_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden3_num_units, input_dim=hidden2_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden4_num_units, input_dim=hidden3_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden5_num_units, input_dim=hidden4_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=output_num_units, input_dim=hidden5_num_units, activation='softmax'),
])
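Dropout(0.25) zeroes each hidden unit with probability 0.25 during training only; Keras uses inverted dropout, rescaling the surviving units so the expected activation is unchanged at test time. A conceptual numpy sketch (not the Keras internals):
In [ ]:
h = np.ones((4, 5), dtype='float32')   # hypothetical activations
mask = rng.rand(*h.shape) >= 0.25      # keep each unit with probability 0.75
h_train = h * mask / 0.75              # rescale the survivors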
In [85]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
trained_model_5d = model.fit(train_x, train_y, nb_epoch=epochs, batch_size=batch_size, validation_data=(val_x, val_y))
In [26]:
# method 5 - early stopping
from keras.callbacks import EarlyStopping
from keras.layers.core import Dropout
import warnings
warnings.filterwarnings('ignore')
model = Sequential([
Dense(output_dim=hidden1_num_units, input_dim=input_num_units, activation='relu'),
Dropout(0.25), # the drop probability is 0.25
Dense(output_dim=hidden2_num_units, input_dim=hidden1_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden3_num_units, input_dim=hidden2_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden4_num_units, input_dim=hidden3_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden5_num_units, input_dim=hidden4_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=output_num_units, input_dim=hidden5_num_units, activation='softmax'),
])
In [27]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
trained_model_5d = model.fit(train_x, train_y, nb_epoch=epochs, batch_size=batch_size, validation_data=(val_x, val_y),
callbacks = [EarlyStopping(monitor='val_acc', patience=2)])
In [9]:
# method 6 - Data Augmentation
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(zca_whitening=True)
# zca_whitening will highlight the outline of each digit
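ZCA whitening decorrelates the pixels while keeping the result in image space; conceptually it does something like the numpy sketch below (the epsilon is illustrative, not the Keras default):
In [ ]:
X = rng.rand(100, 784).astype('float32')   # hypothetical flattened images
Xc = X - X.mean(axis=0)                    # center each pixel
U, S, _ = np.linalg.svd(np.cov(Xc, rowvar=False))
zca = U.dot(np.diag(1.0 / np.sqrt(S + 1e-6))).dot(U.T)
X_white = Xc.dot(zca)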
In [11]:
train = pd.read_csv('digit_recognition/train.csv')
temp = []
for img_name in train.filename:
    training_image_path = 'digit_recognition/Images/train/' + img_name
    training_img = imread(training_image_path, as_gray=True)
    img = training_img.astype('float32')
    temp.append(img)
train_x = np.stack(temp)
# The difference with above starts from here:
train_x = train_x.reshape(train_x.shape[0], 1, 28, 28)
train_x = train_x.astype('float32')
In [ ]:
# fit parameters from data
## fit the training data in order to augment
datagen.fit(train_x) # This will often cause the kernel to die on my machine
# data splitting
# NOTE: train_y was already split into train/validation above, so rebuild it from the re-read labels first
train_y = keras.utils.np_utils.to_categorical(train.label.values)
split_size = int(train_x.shape[0]*0.7)
train_x, val_x = train_x[:split_size], train_x[split_size:]
train_y, val_y = train_y[:split_size], train_y[split_size:]
# train the model with drop out
model = Sequential([
Dense(output_dim=hidden1_num_units, input_dim=input_num_units, activation='relu'),
Dropout(0.25), # the drop probability is 0.25
Dense(output_dim=hidden2_num_units, input_dim=hidden1_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden3_num_units, input_dim=hidden2_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden4_num_units, input_dim=hidden3_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=hidden5_num_units, input_dim=hidden4_num_units, activation='relu'),
Dropout(0.25),
Dense(output_dim=output_num_units, input_dim=hidden5_num_units, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# the Dense layers expect flat 784-dim vectors, so flatten the 4D image arrays back before fitting
trained_model_5d = model.fit(train_x.reshape(train_x.shape[0], -1), train_y, nb_epoch=epochs, batch_size=batch_size,
                             validation_data=(val_x.reshape(val_x.shape[0], -1), val_y))
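One more caveat: the fit() call above trains on the raw arrays and never draws batches from datagen, so the ZCA-whitened data is not actually used. A sketch of wiring the generator in with the Keras 2.1.x fit_generator API (flat_batches is my own helper, added because the Dense layers need flat vectors):
In [ ]:
def flat_batches(gen, x, y, batch_size):
    # draw augmented batches from the generator and flatten them for the Dense layers
    for bx, by in gen.flow(x, y, batch_size=batch_size):
        yield bx.reshape(bx.shape[0], -1), by

trained_model_5d = model.fit_generator(
    flat_batches(datagen, train_x, train_y, batch_size),
    steps_per_epoch=train_x.shape[0] // batch_size,
    epochs=epochs,
    validation_data=(val_x.reshape(val_x.shape[0], -1), val_y))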
A note on patience: if we observe each epoch, the val_loss does not simply drop along the way; it can increase in the middle and then drop again. This is why we need to be careful about the number of epochs and the patience value.