In [1]:
using Alice
In [2]:
Base.Threads.nthreads()
Out[2]:
The MNIST database is a collection of handwritten digits and labels drawn from a larger collection originally created by NIST (the National Institute of Standards and Technology). It contains characters written by American Census Bureau employees and high school students.
The version that everyone now uses for ML benchmarking was assembled by Yann LeCun (http://yann.lecun.com/exdb/mnist/), and there is a Julia package that makes it easy to access the data.
In [3]:
using MNIST
In [4]:
function display_mnist(; flipink = false)
    rows, cols = 16, 32
    num_images = rows * cols
    selection = rand(1:60000, num_images)   # Random selection of image indices
    images = Array{Float64, 2}[]             # Initialise blank outer array to contain image arrays
    img = ones(30, 30)                       # Create blank canvas with some padding
    for i in selection
        img[2:29, 2:29] = reshape(trainfeatures(i), 28, 28) ./ 255  # Reshape each image as a matrix and place in canvas
        push!(images, copy(img))             # Push single image into images vector
    end
    images = hvcat(cols, images...)          # Concatenate individual image matrices into a large matrix
    # Display image with either white or black ink with opposite background
    flipink ? Gray.(1.0 - images) : Gray.(images)
end;
In [5]:
display_mnist()
Out[5]:
In [6]:
display_mnist(flipink = true)
Out[6]:
The original data comes in two separate sets: 60,000 training and 10,000 test observations. Here we'll train on the first 50,000 images of the original training set and hold out the other 10,000 for validation. The test set can stay as it is.
The only pre-processing will be to scale the pixel inputs to the range [0, 1], reshape each image into a 28x28 array (in the provided set each image is a single column of 784 values), and change the output label 0 to 10 so that the labels line up with Julia's 1-based indexing of the ten output classes.
In [7]:
# Load training and test images and labels from the MNIST package
train_images, train_labels = traindata()
test_images, test_labels = testdata()
# Rescale and reshape
train_images = train_images ./ 255
train_images = reshape(train_images, 28, 28, 60000)
test_images = test_images ./ 255
test_images = reshape(test_images, 28, 28, 10000)
# Convert target to integer (from float) type and swap label 0 for 10
train_labels = Int.(train_labels)
train_labels[train_labels .== 0] = 10
test_labels = Int.(test_labels)
test_labels[test_labels .== 0] = 10
# Split training set into 50,000 training and 10,000 validation images
# Images
val_images = train_images[:, :, 50001:60000]
train_images = train_images[:, :, 1:50000]
# Labels
val_labels = train_labels[50001:60000]
train_labels = train_labels[1:50000];
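As a quick sanity check, the resulting array sizes and label ranges (these follow directly from the slicing above):

# Shapes and label range after the split
size(train_images), size(val_images), size(test_images)
# -> ((28, 28, 50000), (28, 28, 10000), (28, 28, 10000))
extrema(train_labels), extrema(val_labels), extrema(test_labels)
# -> ((1, 10), (1, 10), (1, 10))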
In [8]:
# Set seed to be able to replicate
srand(123)
# Data Box and Input Layer
databox = Data(train_images, train_labels, val_images, val_labels)
batch_size = 128
input = InputLayer(databox, batch_size)
# Fully connected hidden layers
dim = 30
fc1 = FullyConnectedLayer(size(input), dim, activation = :tanh)
fc2 = FullyConnectedLayer(size(fc1), dim, activation = :tanh)
# Softmax Output Layer
num_classes = 10
output = SoftmaxOutputLayer(databox, size(fc2), num_classes)
# Model
λ = 1e-3 # Regularisation
net = NeuralNet(databox, [input, fc1, fc2, output], λ, regularisation=:L2)
Out[8]:
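With regularisation=:L2, the penalty added to the data loss is, by the usual convention, λ/2 times the sum of squared weights (biases excluded). A minimal sketch of that term, assuming Alice follows this convention; the exact scaling inside the package may differ:

# Illustrative L2 penalty over a collection of weight arrays (not Alice's internal code)
l2_penalty(weights, λ) = 0.5 * λ * sum(W -> sum(abs2, W), weights)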
In [9]:
# Training parameters
num_epochs = 40 # number of epochs
α = 1e-2 # learning rate
μ = 0.9 # momentum param / viscosity
# Train
train(net, num_epochs, α, μ, nesterov=true, shuffle=true, last_train_every=2, full_train_every=10, val_every=10)
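The train call applies momentum with Nesterov's "look-ahead" modification to each parameter every mini-batch. A minimal sketch of one such update under the common formulation; the names nesterov_update!, v and grad are illustrative, not Alice's API:

# Illustrative Nesterov-momentum update for one parameter array (not Alice's internals)
function nesterov_update!(θ, v, grad, α, μ)
    v_prev = copy(v)
    v .= μ .* v .- α .* grad                  # update the velocity with the current gradient
    θ .= θ .- μ .* v_prev .+ (1 + μ) .* v     # apply the look-ahead corrected step
    return θ, v
end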
In [10]:
# Plot loss curves
Gadfly.set_default_plot_size(24cm, 12cm)
plot_loss_history(net, 2, 10, 10)
Out[10]:
In [11]:
# Set seed to be able to replicate
srand(123)
# Data Box and Input Layer
databox = Data(train_images, train_labels, val_images, val_labels, test_images, test_labels)
batch_size = 128
input = InputLayer(databox, batch_size)
# Convolution Layer
patch_dim = 5
num_patches = 20
patch_dims = (patch_dim, patch_dim, num_patches)
conv1 = ConvolutionLayer(size(input), patch_dims, activation = :relu)
# Pooling Layer
stride = 2
pool1 = MaxPoolLayer(size(conv1), stride)
# Fully Connected Layer
dim = 100
fc1 = FullyConnectedLayer(size(pool1), dim, activation = :relu)
# Softmax Output Layer
num_classes = 10
output = SoftmaxOutputLayer(databox, size(fc1), num_classes)
# Model
λ = 1e-3 # Regularisation
net = NeuralNet(databox, [input, conv1, pool1, fc1, output], λ, regularisation=:L2)
Out[11]:
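For reference, the shapes these layer sizes imply for 28x28 inputs, 5x5 patches and a pooling stride of 2, assuming "valid" convolutions with stride 1 and non-overlapping pooling (a sketch of the arithmetic, not output from the package):

# Assumed shape arithmetic for the single-convolution network
conv1_out = 28 - 5 + 1          # 24: valid 5x5 convolution over a 28x28 image
pool1_out = div(conv1_out, 2)   # 12: 2x2 non-overlapping max pooling
fc1_in    = pool1_out^2 * 20    # 2880 inputs feeding the fully connected layer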
In [12]:
# Training parameters
num_epochs = 40 # number of epochs
α = 1e-2 # learning rate
μ = 0.9 # momentum param / viscosity
# Training
train(net, num_epochs, α, μ, nesterov=true, shuffle=true, last_train_every=2, full_train_every=10, val_every=10)
In [13]:
plot_loss_history(net, 2, 10, 10)
Out[13]:
In [14]:
# Set seed to be able to replicate
srand(123)
# Data Box and Input Layer
databox = Data(train_images, train_labels, val_images, val_labels, test_images, test_labels)
batch_size = 128
input = InputLayer(databox, batch_size)
# Convolution Layer 1
patch_dim = 5
num_patches1 = 20
patch_dims = (patch_dim, patch_dim, num_patches1)
conv1 = ConvolutionLayer(size(input), patch_dims, activation = :relu)
# Pooling Layer 1
stride = 2
pool1 = MaxPoolLayer(size(conv1), stride)
# Convolution Layer 2
patch_dim = 5
num_patches2 = 40
patch_dims = (patch_dim, patch_dim, num_patches1, num_patches2)
conv2 = ConvolutionLayer(size(pool1), patch_dims, activation = :relu)
# Pooling Layer 2
stride = 2
pool2 = MaxPoolLayer(size(conv2), stride)
# Fully Connected Layer
dim = 100
fc1 = FullyConnectedLayer(size(pool2), dim, activation = :relu)
# Softmax Output Layer
num_classes = 10
output = SoftmaxOutputLayer(databox, size(fc1), num_classes)
# Model
λ = 1e-3 # Regularisation
net = NeuralNet(databox, [input, conv1, pool1, conv2, pool2, fc1, output], λ, regularisation=:L2)
Out[14]:
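Note that the four-element patch_dims for conv2 includes the input depth (the 20 feature maps coming out of pool1) as well as the 40 output maps. Under the same assumptions as before, the shapes work out as:

# Assumed shape arithmetic for the two-convolution network
conv1_out = 28 - 5 + 1          # 24
pool1_out = div(conv1_out, 2)   # 12
conv2_out = pool1_out - 5 + 1   # 8
pool2_out = div(conv2_out, 2)   # 4
fc1_in    = pool2_out^2 * 40    # 640 inputs feeding the fully connected layer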
In [15]:
# Training parameters
num_epochs = 20 # number of epochs
α = 1e-2 # learning rate
μ = 0.9 # momentum param / viscosity
# Training
train(net, num_epochs, α, μ, nesterov=true, shuffle=true, last_train_every=2, full_train_every=10, val_every=10)
In [16]:
Gadfly.set_default_plot_size(24cm, 12cm)
plot_loss_history(net, 2, 10, 10)
Out[16]:
In [17]:
accuracy(net, test_images, test_labels)
Out[17]:
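For completeness, accuracy here is just the fraction of test images whose predicted class (the largest softmax output) matches its label. A generic sketch of that measure, not Alice's own accuracy function:

# Illustrative accuracy: predictions and labels are vectors of class indices in 1:10
accuracy_frac(predictions, labels) = mean(predictions .== labels)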