2018-11-24 02:12:25


In [1]:
import numpy as np

In [5]:
# from fastai.core
def even_mults(start:float, stop:float, n:int)->np.ndarray:
    "Build evenly stepped schedule from `star` to `stop` in `n` steps."
    mult = stop/start
    step = mult**(1/(n-1))
    return np.array([start*(step**i) for i in range(n)])
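
A quick illustration (my own, not part of the fastai source): consecutive values differ by a constant factor, not a constant difference, so the schedule is geometric.

In [ ]:
# mine: the steps multiply rather than add
even_mults(1, 100, 3)  # -> array([  1.,  10., 100.])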

Let's say we have a hypothetical network with 3 layer groups (conv_group_1, conv_group_2, linear_group).


In [21]:
layer_groups = ['conv_group_1', 'conv_group_2', 'linear_group']

In [17]:
# simplified from fastai's Learner.lr_range; uses the global `layer_groups`
from typing import Union

def lr_range(lr:Union[float,slice])->np.ndarray:
    "Build differential learning rates from `lr` for the layer groups."
    if not isinstance(lr, slice): return lr
    if lr.start: res = even_mults(lr.start, lr.stop, len(layer_groups))
    else: res = [lr.stop/3]*(len(layer_groups)-1)+[lr.stop]
    return np.array(res)

In [18]:
lr = slice(1e-3)
lr_range(lr)


Out[18]:
array([0.00033333, 0.00033333, 0.001     ])

In [19]:
lr = 1e-3
lr_range(lr)


Out[19]:
0.001

Interesting: if you have multiple trainable layer groups and pass in a slice with only a stop element, the last group gets the lr and all preceding groups get lr / 3.
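
A quick sanity check of that branch (my own, not from the library): with our 3 layer groups, slice(1e-3) should expand to [stop/3, stop/3, stop].

In [ ]:
# my own check: the slice-with-only-a-stop branch is [stop/3]*(n-1) + [stop]
assert np.allclose(lr_range(slice(1e-3)), [1e-3/3, 1e-3/3, 1e-3])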


In [20]:
# 10 layer groups
layer_groups = [i for i in range(10)]
lr = slice(1e-3)
lr_range(lr)


Out[20]:
array([0.00033333, 0.00033333, 0.00033333, 0.00033333, 0.00033333,
       0.00033333, 0.00033333, 0.00033333, 0.00033333, 0.001     ])

Now, what happens when I pass in both a start and a stop value?


In [22]:
lr = slice(1e-6, 1e-3)

In [23]:
lr_range(lr)


Out[23]:
array([1.00000000e-06, 3.16227766e-05, 1.00000000e-03])

In [31]:
1e-3/30


Out[31]:
3.3333333333333335e-05

In [30]:
1e-6*30


Out[30]:
2.9999999999999997e-05

In [32]:
(1e-3/30 + 1e-6/30)*2


Out[32]:
6.673333333333333e-05

This is so cool. Fastai finds the geometric (logarithmic) mean, not the arithmetic mean. This is why the step multiplier is (stop/start)**(1/(n-1)), where n is the number of layer groups.

$$step = \big(\frac{stop}{start}\big)^{\frac{1}{n - 1}} ,\ \ \ n: \mathrm{number\ of\ layer\ groups}$$
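
To convince myself (my own check, not a cell from fastai): for $n = 3$ the middle value should be the geometric mean $\sqrt{start \cdot stop}$.

In [ ]:
# geometric-mean check (mine): the middle of a 3-step schedule is sqrt(start*stop)
np.sqrt(1e-6 * 1e-3)  # -> ~3.16227766e-05, matching Out[23]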

In [34]:
even_mults(1e-6, 1e-3, 3)


Out[34]:
array([1.00000000e-06, 3.16227766e-05, 1.00000000e-03])

In [35]:
even_mults(1e-6, 1e-3, 10)


Out[35]:
array([1.00000000e-06, 2.15443469e-06, 4.64158883e-06, 1.00000000e-05,
       2.15443469e-05, 4.64158883e-05, 1.00000000e-04, 2.15443469e-04,
       4.64158883e-04, 1.00000000e-03])

So the question I have, and why I'm here, is: can I have discriminative learning rates separated by a constant factor of 3 between groups? That is: $\frac{lr}{3^2}, \frac{lr}{3^1}, \frac{lr}{3^0} = $ lr/9, lr/3, lr


In [37]:
lr_stop = 1e-3
lr_start= lr_stop / 3**2

In [38]:
even_mults(lr_start, lr_stop, 3)


Out[38]:
array([0.00011111, 0.00033333, 0.001     ])

In [39]:
1e-3/9


Out[39]:
0.00011111111111111112

This is very exciting.
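
Generalizing (a hypothetical helper of my own, not fastai API): for n groups separated by a constant factor sep, just set start = stop / sep**(n-1).

In [ ]:
# hypothetical helper (mine): n lrs ending at `stop`, each `sep`x the previous
def lrs_with_sep(stop:float, sep:float, n:int)->np.ndarray:
    return even_mults(stop/sep**(n-1), stop, n)

lrs_with_sep(1e-3, 3, 3)  # -> array([0.00011111, 0.00033333, 0.001])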


It also means, for my planet resnet34 thing, I don't need to worry about the internals of the learning rate calculation & assignment. I just need to specify the correct start and end lrs.

Which means all I have to do is pick the appropriate level of aggression for training. This I like.
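
In practice that call would look something like this (a sketch, not run here; it assumes a learner named `learn` with fastai's usual 3 layer groups for a resnet34 built via create_cnn):

In [ ]:
# sketch (not run here): the head trains at lr, earlier groups at lr/3, lr/9
lr = 1e-3
learn.fit_one_cycle(4, max_lr=slice(lr/9, lr))  # 4 epochs is arbitrary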


In [40]:
(1/9 + 1)/2


Out[40]:
0.5555555555555556

In [47]:
5/9


Out[47]:
0.5555555555555556

In [48]:
even_mults(1/9, 1, 3)


Out[48]:
array([0.11111111, 0.33333333, 1.        ])

In [49]:
lr_range(3)


Out[49]:
3

In [57]:
even_mults(1e-10, 1, 11)


Out[57]:
array([1.e-10, 1.e-09, 1.e-08, 1.e-07, 1.e-06, 1.e-05, 1.e-04, 1.e-03,
       1.e-02, 1.e-01, 1.e+00])


In [2]:
from fastai import *
from fastai.vision import *
__version__


Out[2]:
'1.0.28'

In [3]:
import torchvision

In [4]:
path = untar_data(URLs.MNIST_TINY)
tfms = get_transforms()
data = (ImageItemList.from_folder(path).split_by_folder()
        .label_from_folder().transform(tfms).databunch())
learn = create_cnn(data, torchvision.models.inception_v3)


Downloading: "https://download.pytorch.org/models/inception_v3_google-1a9a5a14.pth" to /Users/WayNoxchi/.torch/models/inception_v3_google-1a9a5a14.pth
100%|██████████| 108857766/108857766 [00:12<00:00, 8630245.56it/s]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-205a06a51732> in <module>
      3 data = (ImageItemList.from_folder(path).split_by_folder()
      4         .label_from_folder().transform(tfms).databunch())
----> 5 learn = create_cnn(data, torchvision.models.inception_v3)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/learner.py in create_cnn(data, arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, classification, **kwargs)
     52     meta = cnn_config(arch)
     53     body = create_body(arch(pretrained), ifnone(cut,meta['cut']))
---> 54     nf = num_features_model(body) * 2
     55     head = custom_head or create_head(nf, data.c, lin_ftrs, ps)
     56     model = nn.Sequential(body, head)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in num_features_model(m)
     86 def num_features_model(m:nn.Module)->int:
     87     "Return the number of output features for `model`."
---> 88     return model_sizes(m, full=False)[-1][1]
     89 
     90 def total_params(m:nn.Module) -> int:

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in model_sizes(m, size, full)
     79     ch_in = in_channels(m)
     80     x = next(m.parameters()).new(1,ch_in,*size)
---> 81     x = m.eval()(x)
     82     res = [o.stored.shape for o in hooks]
     83     if not full: hooks.remove()

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    306         x = self.conv0(x)
    307         # 5 x 5 x 128
--> 308         x = self.conv1(x)
    309         # 1 x 1 x 768
    310         x = x.view(x.size(0), -1)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    323 
    324     def forward(self, x):
--> 325         x = self.conv(x)
    326         x = self.bn(x)
    327         return F.relu(x, inplace=True)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    311     def forward(self, input):
    312         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 313                         self.padding, self.dilation, self.groups)
    314 
    315 

RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (5 x 4294967301). Kernel size can't be greater than actual input size at /Users/administrator/nightlies/2018_10_14/wheel_build_dirs/conda_3.7/conda/conda-bld/pytorch-nightly-cpu_1539519280889/work/aten/src/THNN/generic/SpatialConvolutionMM.c:50

In [22]:
??models.resnet18

In [7]:
??torchvision.models.inception_v3

In [25]:
def inception_v3_2(pretrained=False, **kwargs):
    r"""Inception v3 model architecture from
    `"Rethinking the Inception Architecture for Computer Vision" <http://arxiv.org/abs/1512.00567>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = torchvision.models.Inception3(**kwargs)
#     if pretrained:
#         if 'transform_input' not in kwargs:
#             kwargs['transform_input'] = True
#         model.load_state_dict(model_zoo.load_url(model_urls['inception_v3_google']))
    return model

In [26]:
create_cnn(data, inception_v3_2)


---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-6752b14fe4db> in <module>
----> 1 create_cnn(data, inception_v3_2)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/learner.py in create_cnn(data, arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, classification, **kwargs)
     52     meta = cnn_config(arch)
     53     body = create_body(arch(pretrained), ifnone(cut,meta['cut']))
---> 54     nf = num_features_model(body) * 2
     55     head = custom_head or create_head(nf, data.c, lin_ftrs, ps)
     56     model = nn.Sequential(body, head)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in num_features_model(m)
     86 def num_features_model(m:nn.Module)->int:
     87     "Return the number of output features for `model`."
---> 88     return model_sizes(m, full=False)[-1][1]
     89 
     90 def total_params(m:nn.Module) -> int:

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/fastai/callbacks/hooks.py in model_sizes(m, size, full)
     79     ch_in = in_channels(m)
     80     x = next(m.parameters()).new(1,ch_in,*size)
---> 81     x = m.eval()(x)
     82     res = [o.stored.shape for o in hooks]
     83     if not full: hooks.remove()

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    306         x = self.conv0(x)
    307         # 5 x 5 x 128
--> 308         x = self.conv1(x)
    309         # 1 x 1 x 768
    310         x = x.view(x.size(0), -1)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/inception.py in forward(self, x)
    323 
    324     def forward(self, x):
--> 325         x = self.conv(x)
    326         x = self.bn(x)
    327         return F.relu(x, inplace=True)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/Miniconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    311     def forward(self, input):
    312         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 313                         self.padding, self.dilation, self.groups)
    314 
    315 

RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (5 x 4294967301). Kernel size can't be greater than actual input size at /Users/administrator/nightlies/2018_10_14/wheel_build_dirs/conda_3.7/conda/conda-bld/pytorch-nightly-cpu_1539519280889/work/aten/src/THNN/generic/SpatialConvolutionMM.c:50
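
My best guess at what's happening (speculation, not confirmed): create_cnn's num_features_model probes the body with a small dummy image via model_sizes, and Inception3's stem downsamples aggressively (it expects roughly 299x299 inputs), so the probe's feature map shrinks below a kernel's size; the absurd kernel dimension 4294967301 = 2**32 + 5 looks like an unsigned wraparound of a negative computed size. Since commenting out the pretrained loading changes nothing, the weights aren't the problem. A possible workaround (an untested sketch of my own, not fastai's supported path) is to skip create_cnn and its probe entirely and hand a plain Inception3 to Learner:

In [ ]:
# untested sketch (mine): build Inception3 without aux logits, swap the
# classifier head for our class count, and bypass create_cnn's size probe.
# Training would still need ~299px inputs, e.g. .transform(tfms, size=299).
model = torchvision.models.inception_v3(pretrained=False, aux_logits=False)
model.fc = nn.Linear(model.fc.in_features, data.c)
learn = Learner(data, model)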

In [ ]:
??learn.fit_one_cycle

In [ ]:
??learn.lr_range

In [ ]:
??even_mults
