2018-11-24 02:12:25


In [1]:
import numpy as np

In [5]:
# from fastai.core
def even_mults(start:float, stop:float, n:int)->np.ndarray:
    "Build evenly stepped schedule from `star` to `stop` in `n` steps."
    mult = stop/start
    step = mult**(1/(n-1))
    return np.array([start*(step**i) for i in range(n)])
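
A quick illustration (my own, not part of the fastai source): consecutive values differ by a constant factor, not a constant difference, so the schedule is geometric.

In [ ]:
# mine: the steps multiply rather than add
even_mults(1, 100, 3)  # -> array([  1.,  10., 100.])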

Let's say we have a hypothetical network with 3 layer groups (conv_group_1, conv_group_2, linear_group).


In [21]:
layer_groups = ['conv_group_1', 'conv_group_2', 'linear_group']

In [17]:
# simplified from fastai's Learner.lr_range; uses the global `layer_groups`
from typing import Union

def lr_range(lr:Union[float,slice])->np.ndarray:
    "Build differential learning rates from `lr` for the layer groups."
    if not isinstance(lr, slice): return lr
    if lr.start: res = even_mults(lr.start, lr.stop, len(layer_groups))
    else: res = [lr.stop/3]*(len(layer_groups)-1)+[lr.stop]
    return np.array(res)

In [18]:
lr = slice(1e-3)
lr_range(lr)


Out[18]:
array([0.00033333, 0.00033333, 0.001     ])

In [19]:
lr = 1e-3
lr_range(lr)


Out[19]:
0.001

Interesting: if you have multiple trainable layer groups and pass in a slice with only a stop element, the last group gets the lr and all preceding groups get lr / 3.
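
A quick sanity check of that branch (my own, not from the library): with our 3 layer groups, slice(1e-3) should expand to [stop/3, stop/3, stop].

In [ ]:
# my own check: the slice-with-only-a-stop branch is [stop/3]*(n-1) + [stop]
assert np.allclose(lr_range(slice(1e-3)), [1e-3/3, 1e-3/3, 1e-3])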


In [20]:
# 10 layer groups
layer_groups = [i for i in range(10)]
lr = slice(1e-3)
lr_range(lr)


Out[20]:
array([0.00033333, 0.00033333, 0.00033333, 0.00033333, 0.00033333,
       0.00033333, 0.00033333, 0.00033333, 0.00033333, 0.001     ])

Now, what happens when I pass in both a start and a stop value?


In [22]:
lr = slice(1e-6, 1e-3)

In [23]:
lr_range(lr)


Out[23]:
array([1.00000000e-06, 3.16227766e-05, 1.00000000e-03])

In [31]:
1e-3/30


Out[31]:
3.3333333333333335e-05

In [30]:
1e-6*30


Out[30]:
2.9999999999999997e-05

In [32]:
(1e-3/30 + 1e-6/30)*2


Out[32]:
6.673333333333333e-05

This is so cool. Fastai finds the geometric (logarithmic) mean, not the arithmetic mean. This is why the step multiplier is (stop/start)**(1/(n-1)), where n is the number of layer groups.

$$step = \big(\frac{stop}{start}\big)^{\frac{1}{n - 1}} ,\ \ \ n: \mathrm{number\ of\ layer\ groups}$$
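
To convince myself (my own check, not a cell from fastai): for $n = 3$ the middle value should be the geometric mean $\sqrt{start \cdot stop}$.

In [ ]:
# geometric-mean check (mine): the middle of a 3-step schedule is sqrt(start*stop)
np.sqrt(1e-6 * 1e-3)  # -> ~3.16227766e-05, matching Out[23]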

In [34]:
even_mults(1e-6, 1e-3, 3)


Out[34]:
array([1.00000000e-06, 3.16227766e-05, 1.00000000e-03])

In [35]:
even_mults(1e-6, 1e-3, 10)


Out[35]:
array([1.00000000e-06, 2.15443469e-06, 4.64158883e-06, 1.00000000e-05,
       2.15443469e-05, 4.64158883e-05, 1.00000000e-04, 2.15443469e-04,
       4.64158883e-04, 1.00000000e-03])

So the question I have, and why I'm here, is: can I have discriminative learning rates separated by a constant factor of 3 between groups? That is: $\frac{lr}{3^2}, \frac{lr}{3^1}, \frac{lr}{3^0} = $ lr/9, lr/3, lr


In [37]:
lr_stop = 1e-3
lr_start= lr_stop / 3**2

In [38]:
even_mults(lr_start, lr_stop, 3)


Out[38]:
array([0.00011111, 0.00033333, 0.001     ])

In [39]:
1e-3/9


Out[39]:
0.00011111111111111112

This is very exciting.
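
Generalizing (a hypothetical helper of my own, not fastai API): for n groups separated by a constant factor sep, just set start = stop / sep**(n-1).

In [ ]:
# hypothetical helper (mine): n lrs ending at `stop`, each `sep`x the previous
def lrs_with_sep(stop:float, sep:float, n:int)->np.ndarray:
    return even_mults(stop/sep**(n-1), stop, n)

lrs_with_sep(1e-3, 3, 3)  # -> array([0.00011111, 0.00033333, 0.001])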


It also means, for my planet resnet34 thing, I don't need to worry about the internals of the learning rate calculation & assignment. I just need to specify the correct start and end lrs.

Which means all I have to do is pick the appropriate level of aggression for training. This I like.
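
In practice that call would look something like this (a sketch, not run here; it assumes a learner named `learn` with fastai's usual 3 layer groups for a resnet34 built via create_cnn):

In [ ]:
# sketch (not run here): the head trains at lr, earlier groups at lr/3, lr/9
lr = 1e-3
learn.fit_one_cycle(4, max_lr=slice(lr/9, lr))  # 4 epochs is arbitrary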


In [40]:
(1/9 + 1)/2


Out[40]:
0.5555555555555556

In [47]:
5/9


Out[47]:
0.5555555555555556

In [48]:
even_mults(1/9, 1, 3)


Out[48]:
array([0.11111111, 0.33333333, 1.        ])

In [49]:
lr_range(3)


Out[49]:
3

In [57]:
even_mults(1e-10, 1, 11)


Out[57]:
array([1.e-10, 1.e-09, 1.e-08, 1.e-07, 1.e-06, 1.e-05, 1.e-04, 1.e-03,
       1.e-02, 1.e-01, 1.e+00])


In [2]:
from fastai import *
from fastai.vision import *
__version__


Out[2]:
'1.0.28'

In [3]:
import torchvision

In [4]:
path = untar_data(URLs.MNIST_TINY)
tfms = get_transforms()
data = (ImageItemList.from_folder(path).split_by_folder()
        .label_from_folder().transform(tfms).databunch())
learn = create_cnn(data, torchvision.models.inception_v3)


Downloading: "https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth" to /Users/WayNoxchi/.torch/models/inception_v3_google-1a9a5a14.pth
100%|██████████| 108857766/108857766 [00:12<00:00, 8630245.56it/s]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-205a06a51732> in <module>
      3 data = (ImageItemList.from_folder(path).split_by_folder()
      4         .label_from_folder().transform(tfms).databunch())
----> 5 learn = create_cnn(data, torchvision.models.inception_v3)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/learner.py in create_cnn(data, arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, classification, **kwargs)
     52     meta = cnn_config(arch)
     53     body = create_body(arch(pretrained), ifnone(cut,meta['cut']))
---> 54     nf = num_features_model(body) * 2
     55     head = custom_head or create_head(nf, data.c, lin_ftrs, ps)
     56     model = nn.Sequential(body, head)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in num_features_model(m)
     86 def num_features_model(m:nn.Module)->int:
     87     "Return the number of output features for `model`."
---> 88     return model_sizes(m, full=False)[-1][1]
     89 
     90 def total_params(m:nn.Module) -> int:

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in model_sizes(m, size, full)
     79     ch_in = in_channels(m)
     80     x = next(m.parameters()).new(1,ch_in,*size)
---> 81     x = m.eval()(x)
     82     res = [o.stored.shape for o in hooks]
     83     if not full: hooks.remove()

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    306         x = self.conv0(x)
    307         # 5 x 5 x 128
--> 308         x = self.conv1(x)
    309         # 1 x 1 x 768
    310         x = x.view(x.size(0), -1)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    323 
    324     def forward(self, x):
--> 325         x = self.conv(x)
    326         x = self.bn(x)
    327         return F.relu(x, inplace=True)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    311     def forward(self, input):
    312         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 313                         self.padding, self.dilation, self.groups)
    314 
    315 

RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (5 x 4294967301). Kernel size can't be greater than actual input size at /Users/administrator/nightlies/2018_10_14/wheel_build_dirs/conda_3.7/conda/conda-bld/pytorch-nightly-cpu_1539519280889/work/aten/src/THNN/generic/SpatialConvolutionMM.c:50

In [22]:
??models.resnet18

In [7]:
??torchvision.models.inception_v3

In [25]:
def inception_v3_2(pretrained=False, **kwargs):
    r"""Inception v3 model architecture from
    `"Rethinking the Inception Architecture for Computer Vision" <http://arxiv.org/abs/1512.00567>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = torchvision.models.Inception3(**kwargs)
#     if pretrained:
#         if 'transform_input' not in kwargs:
#             kwargs['transform_input'] = True
#         model.load_state_dict(model_zoo.load_url(model_urls['inception_v3_google']))
    return model

In [26]:
create_cnn(data, inception_v3_2)


---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-6752b14fe4db> in <module>
----> 1 create_cnn(data, inception_v3_2)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/learner.py in create_cnn(data, arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, classification, **kwargs)
     52     meta = cnn_config(arch)
     53     body = create_body(arch(pretrained), ifnone(cut,meta['cut']))
---> 54     nf = num_features_model(body) * 2
     55     head = custom_head or create_head(nf, data.c, lin_ftrs, ps)
     56     model = nn.Sequential(body, head)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in num_features_model(m)
     86 def num_features_model(m:nn.Module)->int:
     87     "Return the number of output features for `model`."
---> 88     return model_sizes(m, full=False)[-1][1]
     89 
     90 def total_params(m:nn.Module) -> int:

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in model_sizes(m, size, full)
     79     ch_in = in_channels(m)
     80     x = next(m.parameters()).new(1,ch_in,*size)
---> 81     x = m.eval()(x)
     82     res = [o.stored.shape for o in hooks]
     83     if not full: hooks.remove()

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    306         x = self.conv0(x)
    307         # 5 x 5 x 128
--> 308         x = self.conv1(x)
    309         # 1 x 1 x 768
    310         x = x.view(x.size(0), -1)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    323 
    324     def forward(self, x):
--> 325         x = self.conv(x)
    326         x = self.bn(x)
    327         return F.relu(x, inplace=True)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    311     def forward(self, input):
    312         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 313                         self.padding, self.dilation, self.groups)
    314 
    315 

RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (5 x 4294967301). Kernel size can't be greater than actual input size at /Users/administrator/nightlies/2018_10_14/wheel_build_dirs/conda_3.7/conda/conda-bld/pytorch-nightly-cpu_1539519280889/work/aten/src/THNN/generic/SpatialConvolutionMM.c:50
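
My best guess at what's happening (speculation, not confirmed): create_cnn's num_features_model probes the body with a small dummy image via model_sizes, and Inception3's stem downsamples aggressively (it expects roughly 299x299 inputs), so the probe's feature map shrinks below a kernel's size; the absurd kernel dimension 4294967301 = 2**32 + 5 looks like an unsigned wraparound of a negative computed size. Since commenting out the pretrained loading changes nothing, the weights aren't the problem. A possible workaround (an untested sketch of my own, not fastai's supported path) is to skip create_cnn and its probe entirely and hand a plain Inception3 to Learner:

In [ ]:
# untested sketch (mine): build Inception3 without aux logits, swap the
# classifier head for our class count, and bypass create_cnn's size probe.
# Training would still need ~299px inputs, e.g. .transform(tfms, size=299).
model = torchvision.models.inception_v3(pretrained=False, aux_logits=False)
model.fc = nn.Linear(model.fc.in_features, data.c)
learn = Learner(data, model)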

In [ ]:
??learn.fit_one_cycle

In [ ]:
??learn.lr_range

In [ ]:
??even_mults
