Stochastic depth

Dropout has proved to be an effective tool for improving the robustness of a neural network. Essentially, dropout shuts down some neurons of a specific layer. Gao Huang, Yu Sun, Zhuang Liu and their co-authors, in the article "Deep Networks with Stochastic Depth", went further and shut down whole blocks of layers.
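
To make the idea concrete, here is a minimal NumPy sketch of a residual block with stochastic depth. It is not the code of the StochasticResNet used in this notebook: the names `survival_probabilities`, `stochastic_block`, `residual_fn` and `p_last` are illustrative assumptions, and the ReLU that follows the addition in a real ResNet block is omitted for brevity. The linear decay rule and the final survival probability of 0.5 follow the setting suggested in the paper.

```python
import numpy as np


def survival_probabilities(n_blocks, p_last=0.5):
    """Linear decay rule: p_l = 1 - (l / L) * (1 - p_L) for l = 1..L."""
    l = np.arange(1, n_blocks + 1)
    return 1.0 - l / n_blocks * (1.0 - p_last)


def stochastic_block(x, residual_fn, p_survive, training, rng):
    """Residual block that is randomly replaced by the identity during training."""
    if training:
        # Bernoulli gate: keep the residual branch with probability p_survive.
        if rng.random() < p_survive:
            return x + residual_fn(x)
        return x  # block dropped: only the skip connection remains
    # At test time every block is active, scaled by its survival probability.
    return x + p_survive * residual_fn(x)


rng = np.random.default_rng(0)
x = np.ones(4)
probs = survival_probabilities(n_blocks=4, p_last=0.5)

# Toy "network": the residual branch of every block simply halves its input.
out = x
for p in probs:
    out = stochastic_block(out, lambda t: 0.5 * t, p, training=True, rng=rng)
```

Deeper blocks get a lower survival probability, so the early layers, which extract low-level features, are dropped less often than the later ones.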

In this notebook, we will investigate whether stochastic depth improves the accuracy of neural networks.

See the resnet_with_stochastic_depth module (imported below) if you want to know how the Stochastic ResNet is implemented.

In [1]:
import sys

import matplotlib.pyplot as plt
from tqdm import tqdm_notebook as tqn
%matplotlib inline

import utils
from resnet_with_stochastic_depth import StochasticResNet
from batchflow import B, V, F
from batchflow.opensets import MNIST
from import ResNet50

In our experiments we will work with the MNIST dataset.

In [2]:
dset = MNIST()

First, let us define the input shapes of our model, the loss function and the optimizer:

In [3]:
ResNet_config = {
    'inputs': {'images': {'shape': (28, 28, 1)},
               'labels': {'classes': (10),
                          'transform': 'ohe',
                          'dtype': 'int64',
                          'name': 'targets'}},
    'input_block/inputs': 'images',
    'loss': 'softmax_cross_entropy',
    'optimizer': 'Adam',
    'output': dict(ops=['accuracy'])
}

Stochastic_config = {**ResNet_config}
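
Note that `{**ResNet_config}` makes a shallow copy: the outer dictionary is new, but nested dictionaries such as `'inputs'` are shared with the original. The sketch below illustrates the difference on a hypothetical `base_config`; it is only a reminder of how Python copies dictionaries, not part of the experiment.

```python
import copy

# Hypothetical config with one nested level, for illustration only.
base_config = {
    'loss': 'softmax_cross_entropy',
    'inputs': {'images': {'shape': (28, 28, 1)}},
}

shallow = {**base_config}          # new outer dict, nested dicts are shared
deep = copy.deepcopy(base_config)  # fully independent copy

# Rebinding a top-level key affects only the copy.
shallow['loss'] = 'mse'

# The nested dict is the same object in the shallow copy, but not in the deep one.
shared = shallow['inputs'] is base_config['inputs']
independent = deep['inputs'] is not base_config['inputs']
```

If the two models ever need different nested settings, `copy.deepcopy` is the safer choice.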

Second, we create the training and test pipelines for the plain ResNet model:

In [8]:
res_train_ppl = (dset.train.p
                              feed_dict={'images': B('images'),
                                         'labels': B('labels')}))
res_test_ppl = (dset.test.p
                .init_variable('resacc', init_on_each_run=list)
                .import_model('resnet', res_train_ppl)
                               feed_dict={'images': B('images'),
                                          'labels': B('labels')},

We do the same for the Stochastic ResNet model:

In [9]:
stochastic_train_ppl = (dset.train.p
                        .init_variable('stochasticacc', init_on_each_run=list)
                                     feed_dict={'images': B('images'),
                                                'labels': B('labels')}))
stochastic_test_ppl = (dset.test.p
                       .init_variable('stochasticacc', init_on_each_run=list)
                       .import_model('stochastic', stochastic_train_ppl)
                                      feed_dict={'images': B('images'),
                                                 'labels': B('labels')},

Let's train our models

In [17]:
for i in tqn(range(1000)):
    res_train_ppl.next_batch(400, n_epochs=None, shuffle=True)
    res_test_ppl.next_batch(400, n_epochs=None, shuffle=True)
    stochastic_train_ppl.next_batch(400, n_epochs=None, shuffle=True)
    stochastic_test_ppl.next_batch(400, n_epochs=None, shuffle=True)

Show test accuracy for all iterations

In [20]:
# These variables hold test accuracy, not loss, so name them accordingly.
resnet_acc = res_test_ppl.get_variable('resacc')
stochastic_acc = stochastic_test_ppl.get_variable('stochasticacc')
utils.draw(resnet_acc, 'ResNet', stochastic_acc, 'Stochastic', window=20, type_data='accuracy')

The model with stochastic depth shows a much larger variance in test accuracy, but it reaches a quality similar to that of the plain model.
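
The `window=20` argument suggests that the curves are smoothed before plotting. The implementation of `utils.draw` is not shown here, so the following is only a sketch of one common choice, a simple moving average, applied to synthetic data. The function name `moving_average` and the noisy accuracy series are assumptions made for illustration.

```python
import numpy as np


def moving_average(values, window=20):
    """Average every run of `window` consecutive values."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # 'valid' keeps only positions where the window fits entirely.
    return np.convolve(values, kernel, mode='valid')


# Synthetic per-iteration accuracy: NOT the results of this notebook.
rng = np.random.default_rng(0)
noisy_acc = np.clip(0.95 + 0.03 * rng.standard_normal(1000), 0.0, 1.0)

smoothed = moving_average(noisy_acc, window=20)
```

Smoothing makes the trend easier to read, but it also hides part of the iteration-to-iteration variance, so it is worth looking at the raw values as well.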


  • Our experiment does not show any increase in accuracy for the Stochastic ResNet.
  • Dropping blocks from the network strongly affects the variance of the output.
  • This variance does not decrease as training progresses.
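
One simple way to check the last two points numerically is to split the accuracy series into consecutive segments and compare their standard deviations. The sketch below uses a hypothetical helper `spread_by_segment` and synthetic data; it is a suggestion, not the analysis that produced the conclusions above.

```python
import numpy as np


def spread_by_segment(values, n_segments=4):
    """Standard deviation of each consecutive segment of the series."""
    values = np.asarray(values, dtype=float)
    return [float(seg.std()) for seg in np.array_split(values, n_segments)]


# Synthetic accuracy series with constant noise: NOT the notebook's results.
rng = np.random.default_rng(1)
synthetic_acc = 0.95 + 0.03 * rng.standard_normal(1000)

spreads = spread_by_segment(synthetic_acc, n_segments=4)
```

With real data, one would pass the list returned by `stochastic_test_ppl.get_variable('stochasticacc')`. Roughly equal values across segments would support the claim that the variance does not shrink during training.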

What's next?

  • In our experiment, we chose a certain shutdown threshold. You can choose another one to achieve better quality.
  • If you still have not completed our tutorial, you can fix it right now!
  • Read and try other experiments: