Now it's time to talk about convolutional neural networks. In this notebook you will find out how to configure, train, and evaluate one with BatchFlow.
In [1]:
import sys
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import PIL.ImageFilter  # import the submodule explicitly; PIL.ImageFilter.Kernel is used below
from matplotlib import pyplot as plt
from tqdm import tqdm
%matplotlib inline
# the following line is not required if BatchFlow is installed as a python package.
sys.path.append('../..')
from batchflow import D, B, V, C, R, P
from batchflow.utils import plot_images
from batchflow.opensets import MNIST
from batchflow.models.tf import TFModel
from batchflow.models.torch import TorchModel
from batchflow.models.metrics import ClassificationMetrics
plt.style.use('ggplot')
You don't need to implement the MNIST dataset yourself. It is already done for you.
In [2]:
mnist = MNIST()
We can use deep learning frameworks such as TensorFlow or PyTorch to make a neural network. These frameworks differ a lot under the hood. BatchFlow lets us avoid diving deep into each of them and use the same model configuration, so we can build framework-agnostic models.
But first, we should define the model class under the 'model' key and the channels position under the 'channels' key ('last' for TensorFlow models, 'first' for PyTorch models) in config.
There are also predefined models for both frameworks. You can use them without additional configuration.
In [3]:
config = {
    'model': TorchModel,
    'channels': 'first'}
# or for TensorFlow model
# config = {
#     'model': TFModel,
#     'channels': 'last'}
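To make the 'channels' setting concrete, here is a minimal NumPy sketch of the two layouts. The batch size and the zero-filled array are made up for illustration; only the axis order matters.

```python
import numpy as np

# A hypothetical batch of 4 grayscale 28x28 images in 'channels last' layout:
# (batch, height, width, channels).
batch_last = np.zeros((4, 28, 28, 1), dtype=np.float32)

# Move the channel axis right after the batch axis to get 'channels first':
# (batch, channels, height, width).
batch_first = np.moveaxis(batch_last, -1, 1)

print(batch_last.shape)
print(batch_first.shape)
```

In the pipelines below this conversion is handled by to_array(channels=C('channels')), so the same pipeline code can serve both frameworks.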
As we learned in the previous tutorials, you first have to define a model configuration and create train and test pipelines.
A little bit about the structure of a BatchFlow model:
Let's create a dict with the configuration for our model, model_config. This dict is used when the model is initialized. You can override default parameters or add new ones by putting a key such as 'body/layout' and its value into model_config. Keys such as 'initial_block/inputs' and 'head/units' work the same way.
The main parameter of each architecture is 'layout'. It is a sequence of letters, each denoting an operation. For example, in our model 'c' is a convolution, 'n' is batch normalization, 'a' is an activation, 'P' is global pooling, and 'f' is a dense (fully connected) layer.
In our configuration, 'body/filters' and 'body/kernel_size' are lists whose length equals the number of convolutions; they store individual parameters for each convolution. 'body/strides' is an integer, so the same value is used for all convolutional layers.
You can read more in the docs.
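The list-versus-scalar rule for convolution parameters can be sketched in plain Python. The helper expand_body below is hypothetical and is not part of BatchFlow; it only illustrates how each 'c' in a layout could be paired with its own filters and kernel size while a single strides value is shared.

```python
def expand_body(layout, filters, kernel_size, strides):
    """Pair every 'c' in the layout with its own parameters (illustrative helper)."""
    n_convs = layout.count('c')

    def per_conv(value):
        # A list gives one value per convolution; a scalar is shared by all of them.
        if isinstance(value, (list, tuple)):
            if len(value) != n_convs:
                raise ValueError('expected {} values, got {}'.format(n_convs, len(value)))
            return list(value)
        return [value] * n_convs

    return [dict(filters=f, kernel_size=k, strides=s)
            for f, k, s in zip(per_conv(filters), per_conv(kernel_size), per_conv(strides))]


convs = expand_body('cna cna cna', filters=[16, 32, 64], kernel_size=[7, 5, 3], strides=2)
for conv in convs:
    print(conv)
```

A list of the wrong length raises an error in this sketch, which mirrors the requirement that the list length equals the number of convolutions.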
In [4]:
model_config = {
    'inputs/images/shape': B.image_shape,
    'inputs/labels/classes': D.num_classes,
    'initial_block/inputs': 'images',
    'body': {'layout': 'cna cna cna',
             'filters': [16, 32, 64],
             'kernel_size': [7, 5, 3],
             'strides': 2},
    'head': {'layout': 'Pf',
             'units': 10},
    'loss': 'ce',
    'optimizer': 'Adam',
    'output': dict(predicted=['proba', 'labels'])
}
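As a rough check of what this body does to a 28x28 MNIST image, the sketch below tracks the feature map size through the three convolutions. It assumes 'same' padding, so that each stride-2 convolution maps a side of length n to ceil(n / 2) regardless of kernel size; that padding mode is an assumption here, not something stated in this notebook.

```python
import math

size, channels = 28, 1          # MNIST images are 28x28 with a single channel
for filters in [16, 32, 64]:
    size = math.ceil(size / 2)  # assumed 'same' padding with strides=2
    channels = filters
    print('feature map: {0}x{0}, channels: {1}'.format(size, channels))
```

Under that assumption the body ends with a small spatial map of 64 channels, which the head then reduces with global pooling ('P') and maps to 10 class scores with the dense layer ('f').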
We define our custom function for data augmentation.
In [5]:
def custom_filter(image, kernel_weights=None):
    """ Apply filter with custom kernel to image

    Parameters
    ----------
    image : PIL.Image
        Image to filter.
    kernel_weights : np.array
        Weights of kernel. Defaults to a 3x3 kernel of ones with 10 in the center.

    Returns
    -------
    filtered image
    """
    if kernel_weights is None:
        kernel_weights = np.ones((3, 3))
        kernel_weights[1][1] = 10
    kernel = PIL.ImageFilter.Kernel(kernel_weights.shape, kernel_weights.ravel())
    return image.filter(kernel)
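To see what the default kernel does, here is a NumPy-only sketch that applies the same weights to a tiny array. The kernel is divided by the sum of its weights, which matches Pillow's default scale for ImageFilter.Kernel; the edge replication and the helper name filter_image are choices made for this sketch and are not claimed to match Pillow's border handling.

```python
import numpy as np

kernel = np.ones((3, 3))
kernel[1, 1] = 10
kernel = kernel / kernel.sum()   # weights now sum to 1 (the raw sum is 18)


def filter_image(image, kernel):
    """Filter a 2D array with a 3x3 kernel, replicating edge pixels."""
    padded = np.pad(image, 1, mode='edge')
    out = np.zeros(image.shape, dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out


image = np.zeros((5, 5))
image[2, 2] = 180.0              # a single bright pixel
filtered = filter_image(image, kernel)
print(filtered[1:4, 1:4])
```

The bright pixel keeps most of its intensity and leaks a little into its eight neighbours, so the filter acts as a mild blur.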
Once the config is defined, the next step is to create a pipeline. Note that rotate and scale are methods of the ImagesBatch class. You can see all available augmentations in the images tutorial.
In contrast, apply_transform is a method of the Batch class. It is worth mentioning because it runs our function custom_filter in parallel. See the docs for more about parallel methods.
In [6]:
train_pipeline = (
    mnist.train.p
    .init_variable('loss_history', default=[])
    .init_model('dynamic', C('model'), 'conv', config=model_config)
    .apply_transform(custom_filter, src='images', p=0.8)
    .shift(offset=P(R('randint', 8, size=2)), p=0.8)
    .rotate(angle=P(R('uniform', -10, 10)), p=0.8)
    .scale(factor=P(R('uniform', 0.8, 1.2, size=R([1, 2]))), preserve_shape=True, p=0.8)
    .to_array(channels=C('channels'), dtype=np.float32)
    .multiply(multiplier=1/255)
    .train_model('conv', fetches='loss', images=B('images'), targets=B('labels'),
                 save_to=V('loss_history', mode='a'))
) << config
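The p=0.8 arguments and the P(R(...)) expressions describe per-image randomness: each image is transformed with some probability, and each transformed image gets its own sampled parameter. A minimal NumPy sketch of that idea follows; the augment helper, the row shift, and the parameter range are all hypothetical and do not reproduce BatchFlow's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)


def augment(images, transform, sample_param, p):
    """Apply transform(image, param) to each image with probability p."""
    result = []
    for image in images:
        if rng.random() < p:
            result.append(transform(image, sample_param()))  # fresh parameter per image
        else:
            result.append(image)
    return np.stack(result)


def shift_rows(image, offset):
    """Cyclically shift image rows by offset (a stand-in for a real augmentation)."""
    return np.roll(image, offset, axis=0)


def sample_offset():
    """Draw a random offset of 1 or 2 rows."""
    return int(rng.integers(1, 3))


images = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)
unchanged = augment(images, shift_rows, sample_offset, p=0.0)
shifted = augment(images, shift_rows, sample_offset, p=1.0)
```

With p=0.8, roughly four out of five images in a batch would be transformed and the rest passed through as they are.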
Testing on the augmented data
In [7]:
validation_pipeline = (
    mnist.test.p
    .init_variable('predictions')
    .init_variable('metrics', default=None)
    .import_model('conv', train_pipeline)
    .apply_transform(custom_filter, src='images', p=0.8)
    .shift(offset=P(R('randint', 8, size=2)), p=0.8)
    .rotate(angle=P(R('uniform', -10, 10)), p=0.8)
    .scale(factor=P(R('uniform', 0.8, 1.2, size=R([1, 2]))), preserve_shape=True, p=0.8)
    .to_array(channels=C('channels'), dtype=np.float32)
    .multiply(multiplier=1/255)
    .predict_model('conv', images=B('images'),
                   fetches='predictions', save_to=V('predictions'))
    .gather_metrics(ClassificationMetrics, targets=B('labels'), predictions=V('predictions'),
                    fmt='logits', axis=-1, save_to=V('metrics', mode='a'))
) << config
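The fmt='logits' and axis=-1 arguments say that predictions are raw class scores with the class dimension last. As a rough illustration of how accuracy follows from such predictions, here is a NumPy sketch with made-up logits and targets; it is not how ClassificationMetrics is implemented.

```python
import numpy as np

# Hypothetical logits for 3 items and 3 classes, with classes on the last axis.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0],
                   [1.5, 1.0, 0.0]])
targets = np.array([0, 2, 1])

predicted = logits.argmax(axis=-1)          # most likely class per item
accuracy = (predicted == targets).mean()    # share of correct predictions
print(predicted, accuracy)
```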
We introduce early stopping to terminate training once the average accuracy over the last few validation iterations exceeds 90 percent.
In [8]:
MAX_ITER = 500
FREQUENCY = N_LAST = 20
batch_size = 128
for curr_iter in tqdm(range(1, MAX_ITER + 1)):
    train_pipeline.next_batch(batch_size)
    validation_pipeline.next_batch(batch_size)
    if curr_iter % FREQUENCY == 0:
        metrics = validation_pipeline.v('metrics')
        accuracy = metrics[-N_LAST:].evaluate('accuracy')
        # Early stopping
        if accuracy > 0.9:
            print('Early stop on {} iteration. Accuracy: {}'.format(curr_iter, accuracy))
            break
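The stopping rule can be tried out on its own with a plain list of per-batch accuracies. The sketch below is a simplification: it averages the last n_last values directly, whereas the loop above evaluates accuracy on the accumulated metrics object. The function first_stop and the sample history are made up for illustration.

```python
def first_stop(batch_accuracies, frequency=20, n_last=20, threshold=0.9):
    """Return the first checked iteration where the recent mean accuracy exceeds threshold."""
    for curr_iter in range(1, len(batch_accuracies) + 1):
        if curr_iter % frequency == 0:
            recent = batch_accuracies[max(0, curr_iter - n_last):curr_iter]
            if sum(recent) / len(recent) > threshold:
                return curr_iter
    return None  # the threshold was never reached


# A made-up history: accuracy improves every 20 iterations.
history = [0.5] * 20 + [0.85] * 20 + [0.95] * 20
print(first_stop(history))
```

Checking only every frequency iterations keeps the check cheap, and averaging over several batches makes the decision less sensitive to one lucky batch.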
Take a look at the loss history during training.
In [9]:
plt.figure(figsize=(15, 5))
plt.plot(train_pipeline.v('loss_history'))
plt.xlabel("Iterations")
plt.ylabel("Loss")
plt.show()
Our network is ready for inference. Now we don't use data augmentations. Let's take a look at the predictions.
In [10]:
inference_pipeline = (
    mnist.test.p
    .init_variables('proba', 'labels')
    .import_model('conv', train_pipeline)
    .to_array(channels=C('channels'), dtype=np.float32)
    .multiply(multiplier=1/255)
    .predict_model('conv', images=B('images'),
                   fetches=['predicted_proba', 'predicted_labels'],
                   save_to=[V('proba'), V('labels')])
) << config
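The fetched 'predicted_proba' and 'predicted_labels' come from the 'output' key of model_config. Assuming they correspond to a softmax over the class scores and the index of the largest probability, which is how the names read but is not spelled out in this notebook, the relationship can be sketched in NumPy with made-up logits:

```python
import numpy as np


def softmax(logits):
    """Turn class scores into probabilities along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical logits for 2 items and 3 classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
proba = softmax(logits)
labels = proba.argmax(axis=-1)
print(proba.round(3), labels)
```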
It's always interesting to look at the images, so let's draw them.
In [11]:
batch = inference_pipeline.next_batch(12, shuffle=True)
plot_images(np.squeeze(batch.images), batch.labels,
            batch.pipeline.v('proba'), ncols=4, figsize=(30, 35))
Today you have learnt how to configure, train, and evaluate a convolutional network, and found out that a network can be more robust with data augmentations (like scale and rotate).
Now that you know how to work with convolutional models, you can experiment with the model configuration and augmentations yourself.
Your goal is 0.95 on the test data!
Good luck!
See image augmentation tutorial for advanced augmentation methods or return to the table of contents.