CNNs and MLPs can be trained and used by describing the layers that compose the neural network. For CNNs we use convolution-related layers plus flat layers, while for MLPs we use only the latter. Also, inputs for CNNs must be 4-dimensional, while inputs for MLPs must be 2-dimensional.
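For contrast, an MLP for the same task would take a 2-dimensional input of shape [batch_size x features] and be described with flat layers only. A minimal illustrative descriptor, using the LINE and SOFT layer types that appear in the CNN descriptor later in this notebook (an assumption for exposition, not a cell from this notebook), could be:
# Flat-only layer descriptor for an MLP on 28 * 28 = 784 input features
mlp_layers <- list(
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);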
In [1]:
library(rcnn);
library(ggplot2);
library(reshape2);
In [2]:
data(mnist)
Datasets must be shaped as [batch_size x channels x height x width]; pixel values are also scaled to the [0, 1] range.
In [3]:
img_size <- c(28,28);
training_x <- array(mnist$train$x, c(nrow(mnist$train$x), 1, img_size)) / 255;
training_y <- binarization(mnist$train$y);
testing_x <- array(mnist$test$x, c(nrow(mnist$test$x), 1, img_size)) / 255;
testing_y <- binarization(mnist$test$y);
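The binarization function presumably one-hot encodes the integer labels into a {0, 1} matrix with one column per class (ten here, matching the 10-unit output layers below). A base-R sketch of that behaviour, using a hypothetical helper name:
# Hypothetical base-R equivalent of binarization: one-hot encode a label vector
one_hot <- function(y) {
    classes <- sort(unique(y));
    m <- matrix(0, nrow = length(y), ncol = length(classes));
    m[cbind(seq_along(y), match(y, classes))] <- 1;
    m
}
# one_hot(c(0, 2, 1)) is a 3 x 3 matrix with a single 1 per row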
A visualization of a sample of the training dataset:
In [4]:
printset <- training_x[1:48,1,,,drop=TRUE];
longData <- melt(printset)
longData$Var2 <- (longData$Var2 - 1) %% 28 + 1;
longData$Var3 <- 29 - longData$Var3;
longData$Var1 <- paste("Img", longData$Var1, ": ", mnist$train$y[1:48], sep="")
ggplot(data = longData) +
    geom_tile(aes(Var2, Var3, fill = value)) +
    facet_wrap(~Var1, nrow = 6) +
    scale_fill_continuous(low = 'white', high = 'black') +
    coord_equal() +
    labs(x = NULL, y = NULL, title = "Sample of Training Dataset") +
    theme(legend.position = "none")
The layer descriptor must be a list of vectors holding the hyperparameters of each layer. Check the help for train.cnn to see the available layer types and the properties of each kind. Note how n_visible in the LINE layer matches the flattened output of the preceding layers: the 'same' convolution preserves the 28x28 input size, the stride-2 pooling halves each spatial dimension to 14x14, and flattening the 4 channels gives 4 * 14 * 14 = 784.
In [5]:
layers <- list(
    c('type' = "CONV", 'n_channels' = 1, 'n_filters' = 4, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 4, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "RELU", 'n_channels' = 4),
    c('type' = "FLAT", 'n_channels' = 4),
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);
The CNN receives as inputs:
- training_x: the training images, as a 4-dimensional array
- training_y: the binarized training labels
- layers: the layer descriptor defined above

It also receives the following hyperparameters:
- batch_size: number of samples per weight update
- training_epochs: number of passes over the training set
- learning_rate: step size of the weight updates
- rand_seed: seed for the random number generator, for reproducibility
In [6]:
mnist_cnn <- train.cnn(training_x,
                       training_y,
                       layers,
                       batch_size = 10,
                       training_epochs = 20,
                       learning_rate = 1e-3,
                       rand_seed = 1234
);
In [7]:
prediction <- predict(mnist_cnn, testing_x);
str(prediction)
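Beyond inspecting the structure, a quick overall accuracy check; note that prediction$class is 1-based while the MNIST labels run from 0 to 9, hence the -1 (the same shift is used in the confusion matrix below):
# Fraction of test images classified correctly
mean((prediction$class - 1) == mnist$test$y)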
Plot some results: a sample of the test set, labeled with the predicted classes.
In [8]:
printset <- testing_x[1:48,1,,,drop=TRUE];
longData <- melt(printset)
longData$Var2 <- (longData$Var2 - 1) %% 28 + 1;
longData$Var3 <- 29 - longData$Var3;
longData$Var1 <- paste("Img", longData$Var1, ": ", prediction$class[1:48] - 1, sep="")
ggplot(data = longData) +
    geom_tile(aes(Var2, Var3, fill = value)) +
    facet_wrap(~Var1, nrow = 6) +
    scale_fill_continuous(low = 'white', high = 'black') +
    coord_equal() +
    labs(x = NULL, y = NULL, title = "Sample of Test Set Predictions") +
    theme(legend.position = "none")
The confusion matrix of predicted versus true classes:
In [9]:
table(prediction$class - 1, mnist$test$y);
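From the same counts we can derive per-class recall. Forcing the predicted side to the full 0:9 range keeps the table square even if some class is never predicted:
# Per-class recall: fraction of each true digit classified correctly
conf <- table(factor(prediction$class - 1, levels = 0:9), mnist$test$y);
diag(conf) / colSums(conf)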
A network that has already been trained can be passed through init_cnn to continue its training for more epochs; the layer descriptor is then taken from the given network instead of being passed again.
In [10]:
mnist_cnn_update <- train.cnn(training_x,
                              training_y,
                              batch_size = 10,
                              training_epochs = 3,
                              learning_rate = 1e-3,
                              rand_seed = 1234,
                              init_cnn = mnist_cnn
);
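To see whether the extra epochs helped, the updated network can be evaluated the same way as before:
# Test accuracy of the network after the additional training epochs
prediction_update <- predict(mnist_cnn_update, testing_x);
mean((prediction_update$class - 1) == mnist$test$y)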
The same models can also be trained with a native R implementation of the CNN. We clear the workspace, then source cnn.R instead of loading the package.
In [11]:
rm(list = ls());
In [12]:
setwd("..");
source("./cnn.R");
setwd("./notebooks");
In [13]:
mnist <- readRDS("../datasets/mnist.rds");
In [14]:
img_size <- c(28,28);
training_x <- array(mnist$train$x, c(nrow(mnist$train$x), 1, img_size)) / 255;
training_y <- binarization(mnist$train$y);
testing_x <- array(mnist$test$x, c(nrow(mnist$test$x), 1, img_size)) / 255;
testing_y <- binarization(mnist$test$y);
Let's reduce the dataset size for this example, to keep the native R training time manageable.
training_x <- training_x[1:1000,,,, drop=FALSE]; training_y <- training_y[1:1000,, drop=FALSE];
testing_x <- testing_x[1:1000,,,, drop=FALSE]; testing_y <- testing_y[1:1000,, drop=FALSE];
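A quick check that the subsetting preserved the required shapes; drop = FALSE is what keeps the singleton channel dimension from being dropped:
dim(training_x)    # 1000 1 28 28
dim(training_y)    # 1000 10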
In [15]:
layers <- list(
    c('type' = "CONV", 'n_channels' = 1, 'n_filters' = 4, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 4, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "RELU", 'n_channels' = 4),
    c('type' = "FLAT", 'n_channels' = 4),
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);
The native R CNN is trained like the one in the package, except that the function is train_cnn instead of train.cnn.
In [16]:
cnn1 <- train_cnn(training_x = training_x,
                  training_y = training_y,
                  layers = layers,
                  batch_size = 10,
                  training_epochs = 20,
                  learning_rate = 1e-3,
                  rand_seed = 1234
);
In the native version, predict_cnn is not registered as an S3 method, so it is called directly rather than through predict.
In [17]:
prediction <- predict_cnn(cnn1, testing_x);
str(prediction)
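Assuming the native prediction object exposes the same class component as the package version, accuracy on the reduced test set can be checked the same way:
# Accuracy over the first 1000 test images (the reduced test set)
mean((prediction$class - 1) == mnist$test$y[1:1000])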
Continuing the training of an existing network also works in the native version via init_cnn; here the layer descriptor is still passed explicitly.
In [18]:
cnn1_update <- train_cnn(training_x = training_x,
                         training_y = training_y,
                         layers = layers,
                         batch_size = 10,
                         training_epochs = 3,
                         learning_rate = 1e-3,
                         rand_seed = 1234,
                         init_cnn = cnn1
);
In [19]:
# Optionally save the workspace objects for a later session
#save.image(file="rcnn.data")