ImageNet Sample Prelims

Preparing to train a new model on ImageNet by going through a sample set of it first.

Imports, Paths


In [9]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2

In [10]:
from fastai.conv_learner import *
from fastai.models import darknet

In [11]:
from pathlib import Path
import os
import pandas as pd

In [12]:
PATH = Path('data/imagenet')
PATH_TRAIN = PATH/'train'

In [13]:
PATH, PATH_TRAIN


Out[13]:
(PosixPath('data/imagenet'), PosixPath('data/imagenet/train'))

I. Taking a look at Data


In [6]:
folder = os.listdir(PATH_TRAIN)[0]
os.listdir(PATH_TRAIN/folder)


Out[6]:
['n02894605_44136.JPEG',
 'n02894605_70618.JPEG',
 'n02894605_57969.JPEG',
 'n02894605_57217.JPEG',
 'n02894605_26115.JPEG',
 'n02894605_79044.JPEG',
 'n02894605_36918.JPEG',
 'n02894605_74409.JPEG',
 'n02894605_57337.JPEG',
 'n02894605_26864.JPEG',
 'n02894605_15211.JPEG',
 'n02894605_61478.JPEG',
 'n02894605_103868.JPEG',
 'n02894605_86482.JPEG',
 'n02894605_25816.JPEG',
 'n02894605_121711.JPEG',
 'n02894605_29318.JPEG',
 'n02894605_24721.JPEG',
 'n02894605_51257.JPEG',
 'n02894605_29824.JPEG',
 'n02894605_13652.JPEG',
 'n02894605_34440.JPEG',
 'n02894605_57537.JPEG',
 'n02894605_2436.JPEG',
 'n02894605_64529.JPEG',
 'n02894605_36723.JPEG',
 'n02894605_24755.JPEG']

In [16]:
fimg = PATH_TRAIN / folder / os.listdir(PATH_TRAIN/folder)[0]
Image.open(fimg)


Out[16]:

In [6]:
def view_img(folder_path, idx=-1):
    files = os.listdir(folder_path)
    if idx < 0: idx = np.random.randint(len(files))
        
    fimg = str(folder_path / files[idx])
    img = cv2.imread(str(fimg))
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

In [49]:
view_img(PATH_TRAIN/folder, idx=0)



In [50]:
view_img(PATH_TRAIN/folder)



In [51]:
view_img(PATH_TRAIN/folder)



In [52]:
view_img(PATH_TRAIN/folder)



In [12]:
# os.listdir(PATH_TRAIN)

# folders = os.listdir(PATH_TRAIN)
# for folder in folders:
#     print(f'{len(os.listdir(PATH_TRAIN/folder))}')

II. ImageNet labels lookup dictionary


In [7]:
imagenet_labels = pd.read_csv(PATH/'imagenet_labels.txt', delim_whitespace=True, header=None)

In [8]:
imagenet_labels.head()


Out[8]:
0 1 2
0 n02119789 1 kit_fox
1 n02100735 2 English_setter
2 n02110185 3 Siberian_husky
3 n02096294 4 Australian_terrier
4 n02102040 5 English_springer

I'm not sure whether I should keep the class number or not - since I think that's an internal detail that's handled automatically - so I'll just make a CSV matching folder codes to class names.


In [26]:
# imagenet_labels = imagenet_labels.drop(columns=[1])

On second thought, I can just use the .from_paths fastai method, and have a dictionary to lookup classes.


In [35]:
imagenet_labels.as_matrix()


Out[35]:
array([['n02119789', 1, 'kit_fox'],
       ['n02100735', 2, 'English_setter'],
       ['n02110185', 3, 'Siberian_husky'],
       ...,
       ['n04325704', 998, 'stole'],
       ['n07831146', 999, 'carbonara'],
       ['n03255030', 1000, 'dumbbell']], dtype=object)

In [41]:
# {c0 : c2 for c0,_,c2 in imagenet_labels.as_matrix()}

In [9]:
imagenet_labels_lookup = {c0 : c2 for c0,_,c2 in imagenet_labels.as_matrix()}

III. Validation Set


In [9]:
def reset_valset(path):
    path_val = path/'valid'
    path_trn = path/'train'
    
    if not os.path.exists(path_val):
        print('No validation directory to reset.')
        return
    
    for folder in path_val.iterdir():
        for f in folder.iterdir():
            os.rename(f, path_trn / str(f).split('valid/')[-1])

def create_valset(path, p=0.15, seed=0):
    np.random.seed(seed=seed)
    
    path_val = path/'valid'
    path_trn = path/'train'
    reset_valset(path)
    
    # move random p-percent selection from train/ to valid/
    for folder in path_trn.iterdir():
        os.makedirs(path_val/str(folder).split('train/')[-1], exist_ok=True)
        flist = list(folder.iterdir())
        n_move = int(np.round(len(flist) * p))
        fmoves = np.random.choice(flist, n_move, replace=False)
        
        for f in fmoves:
            os.rename(f, path_val / str(f).split('train/')[-1])

def count_files(path):
    count = 0
    for folder in path.iterdir():
        count += len(list(folder.glob('*')))
    return count

In [161]:
count_files(PATH_TRAIN)


Out[161]:
19439

In [164]:
reset_valset(PATH)


No validation directory to reset.

In [208]:
create_valset(PATH)

In [209]:
count_files(PATH_TRAIN)


Out[209]:
16545

In [211]:
19439 * (1 - .15)


Out[211]:
16523.149999999998

In [11]:
count_files(PATH/'valid')


Out[11]:
2894

In [20]:
reset_valset(PATH)
count_files(PATH_TRAIN), count_files(PATH/'valid')


Out[20]:
(19439, 0)

In [217]:
folder = next(iter(PATH_TRAIN.iterdir()))
view_img(folder)



In [218]:
folder = next(iter((PATH/'valid').iterdir()))
view_img(folder)


IV. Training Darknet on ImageNet sampleset


In [13]:
sz = 256
bs = 32

darknet53 = darknet.darknet_53()

# tfms = tfms_from_stats(imagenet_stats, sz, aug_tfms=transforms_side_on, max_zoom=1.05, pad=sz//8)

# tfms = tfms_from_model(darknet53, sz) # loads imagenet_stats
tfms = tfms_from_model(resnet34, sz)

model_data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms, num_workers=4)

In [14]:
learner = ConvLearner.from_model_data(darknet53, data)
# learner = ConvLearner.from_model_data(darknet53, model_data)
# learner.crit = F.nll_loss

In [15]:
learner.crit


Out[15]:
<function torch.nn.functional.nll_loss(input, target, weight=None, size_average=True, ignore_index=-100, reduce=True)>

In [16]:
# learner.summary()

In [ ]:
learner.lr_find()
learner.sched.plot(10,5)


  0%|          | 4/828 [00:13<44:40,  3.25s/it, loss=-0.00232] 

Copying Sgugger ModelData way



In [14]:
sz = 256
bs = 32

tfms = tfms_from_stats(imagenet_stats, sz, aug_tfms=transforms_side_on, max_zoom=1.05, pad=sz//8)
model_data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms, num_workers=4)

model_data = get_data(sz,bs)

darknet53 = darknet.darknet_53()

In [15]:
learner = ConvLearner.from_model_data(darknet53, model_data)
learner.crit = F.nll_loss

In [ ]:
learner.lr_find()


  2%|▏         | 8/414 [00:40<33:53,  5.01s/it, loss=0.0605]  

V. Sanity Check: Replicating Darknet53

Making sure I can get similar performance to Sylvain Gugger's implementation of Darknet53.


In [6]:
# list((PATH/'train').iterdir())

In [22]:
L = list((PATH/'train').iterdir())
L1 = L[1].glob('*')
# list(L1)

In [23]:
filenames, classes = [], []
TRN_PATH = (PATH/'train')
for directory in TRN_PATH.iterdir():
    for fn in directory.glob('*'):
        filenames.append(str(fn)[len(str(TRN_PATH))+1:])
        classes.append(str(directory)[len(str(TRN_PATH))+1:])
class_names = list(set(classes))
class2idx = {c:i for i,c in enumerate(class_names)}
labels = [class2idx[c] for c in classes]

In [24]:
df = pd.DataFrame({'filenames':filenames, 'cats':labels}, columns=['filenames', 'cats'])
df.head()


Out[24]:
filenames cats
0 n02894605/n02894605_44136.JPEG 233
1 n02894605/n02894605_70618.JPEG 233
2 n02894605/n02894605_57969.JPEG 233
3 n02894605/n02894605_57217.JPEG 233
4 n02894605/n02894605_26115.JPEG 233

In [25]:
df.to_csv(PATH/'train.csv', index=False)

In [11]:
# stats = (np.array([0.4855, 0.456, 0.406]), np.array([0.229, 0.224, 0.225]))

In [12]:
def get_data(sz,bs):
    tfms = tfms_from_model(resnet50,sz)
    return ImageClassifierData.from_csv(PATH,'train',PATH/'train.csv',bs=bs,tfms=tfms)

In [13]:
size = 256
batch_size = 16
data = get_data(size, batch_size)

Darknet


In [14]:
class ConvBN(nn.Module):
    # convolutional layer then BatchNorm
    def __init__(self, ch_in, ch_out, kernel_size=3, stride=1, padding=0):
        super().__init__()
        self.conv = nn.Conv2d(ch_in, ch_out, kernel_size=kernel_size, stride=stride,
                              padding=padding, bias=False)
        self.bn = nn.BatchNorm2d(ch_out, momentum=0.01)
    
    def forward(self, x):
        return F.leaky_relu(self.bn(self.conv(x)), negative_slope=0.1)
    
class DarknetBlock(nn.Module):
    # the basic blocks
    def __init__(self, ch_in):
        super().__init__()
        ch_hid = ch_in//2
        self.conv1 = ConvBN(ch_in, ch_hid, kernel_size=1, stride=1, padding=0)
        self.conv2 = ConvBN(ch_hid, ch_in, kernel_size=3, stride=1, padding=1)
        
    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        return out + x
    
class Darknet(nn.Module):
    # Replicates table 1 from the YOLOv3 paper
    def __init__(self, num_blocks, num_classes=1000):
        super().__init__()
        self.conv = ConvBN(3, 32, kernel_size=3, stride=1, padding=1)
        self.layer1 = self.make_group_layer(32, num_blocks[0])
        self.layer2 = self.make_group_layer(64, num_blocks[1], stride=2)
        self.layer3 = self.make_group_layer(128,num_blocks[2], stride=2)
        self.layer4 = self.make_group_layer(256,num_blocks[3], stride=2)
        self.layer5 = self.make_group_layer(512,num_blocks[4], stride=2)
        self.linear = nn.Linear(1024, num_classes)
        
    def make_group_layer(self, ch_in, num_blocks, stride=1):
        layers = [ConvBN(ch_in, ch_in*2, stride=stride)]
        for i in range(num_blocks):
            layers.append(DarknetBlock(ch_in*2))
        return nn.Sequential(*layers)
    
    def forward(self, x):
        out = self.conv(x)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = self.layer5(out)
        out = F.adaptive_avg_pool2d(out, 1)
        out = out.view(out.size(0), -1)
        return F.log_softmax(self.linear(out))

In [15]:
darknet53 = Darknet([1,2,8,8,4])

In [14]:
darknet53 = darknet.darknet_53()

In [15]:
learn = ConvLearner.from_model_data(darknet53, data)
learn.crit = F.nll_loss

In [16]:
# learn.summary()

In [17]:
learn.lr_find()


  3%|▎         | 24/828 [01:09<39:03,  2.91s/it, loss=0.033]   

In [18]:
learn.lr_find()


 25%|██▌       | 207/828 [10:22<31:06,  3.00s/it, loss=7]   
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-18-d81c6bd29d71> in <module>()
----> 1 learn.lr_find()

~/Kaukasos/FADL2/fastai/learner.py in lr_find(self, start_lr, end_lr, wds, linear, **kwargs)
    328         layer_opt = self.get_layer_opt(start_lr, wds)
    329         self.sched = LR_Finder(layer_opt, len(self.data.trn_dl), end_lr, linear=linear)
--> 330         self.fit_gen(self.model, self.data, layer_opt, 1, **kwargs)
    331         self.load('tmp')
    332 

~/Kaukasos/FADL2/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
    232             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
    233             swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
--> 234             swa_eval_freq=swa_eval_freq, **kwargs)
    235 
    236     def get_layer_groups(self): return self.models.get_layer_groups()

~/Kaukasos/FADL2/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, **kwargs)
    126             batch_num += 1
    127             for cb in callbacks: cb.on_batch_begin()
--> 128             loss = model_stepper.step(V(x),V(y), epoch)
    129             avg_loss = avg_loss * avg_mom + loss * (1-avg_mom)
    130             debias_loss = avg_loss / (1 - avg_mom**batch_num)

~/Kaukasos/FADL2/fastai/model.py in step(self, xs, y, epoch)
     53         if self.loss_scale != 1: assert(self.fp16); loss = loss*self.loss_scale
     54         if self.reg_fn: loss = self.reg_fn(output, xtra, raw_loss)
---> 55         loss.backward()
     56         if self.fp16: update_fp32_grads(self.fp32_params, self.m)
     57         if self.loss_scale != 1:

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/variable.py in backward(self, gradient, retain_graph, create_graph, retain_variables)
    165                 Variable.
    166         """
--> 167         torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
    168 
    169     def register_hook(self, hook):

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(variables, grad_variables, retain_graph, create_graph, retain_variables)
     97 
     98     Variable._execution_engine.run_backward(
---> 99         variables, grad_variables, retain_graph)
    100 
    101 

KeyboardInterrupt: 

Va .DataLoader Preliminary Tests


In [6]:
## these parameters are universal
sz = 256
bs = 32

darknet53 = darknet.darknet_53()

In [7]:
## the .from_paths way (mine) with stock transforms
tfms = tfms_from_stats(imagenet_stats, sz)
model_data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms, val_name='train')

In [7]:
## the .from_csv way with stock transforms
tfms = tfms_from_model(resnet50, sz)
model_data = ImageClassifierData.from_csv(PATH, 'train', PATH/'train.csv', 
                                          bs=bs, tfms=tfms)

.from_paths


In [8]:
learner = ConvLearner.from_model_data(darknet53, model_data)
learner.lr_find()
learner.sched.plot()


 12%|█▏        | 74/608 [05:26<39:13,  4.41s/it, loss=0.0239]  

.from_csv


In [8]:
learner = ConvLearner.from_model_data(darknet53, model_data)
learner.lr_find()
learner.sched.plot()


  4%|▍         | 16/414 [01:17<32:12,  4.86s/it, loss=-0.000439]
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-8-04cc713e2f49> in <module>()
      1 learner = ConvLearner.from_model_data(darknet53, model_data)
----> 2 learner.lr_find()
      3 learner.sched.plot()

~/Kaukasos/FADL2/fastai/learner.py in lr_find(self, start_lr, end_lr, wds, linear, **kwargs)
    328         layer_opt = self.get_layer_opt(start_lr, wds)
    329         self.sched = LR_Finder(layer_opt, len(self.data.trn_dl), end_lr, linear=linear)
--> 330         self.fit_gen(self.model, self.data, layer_opt, 1, **kwargs)
    331         self.load('tmp')
    332 

~/Kaukasos/FADL2/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
    232             metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
    233             swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
--> 234             swa_eval_freq=swa_eval_freq, **kwargs)
    235 
    236     def get_layer_groups(self): return self.models.get_layer_groups()

~/Kaukasos/FADL2/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, **kwargs)
    126             batch_num += 1
    127             for cb in callbacks: cb.on_batch_begin()
--> 128             loss = model_stepper.step(V(x),V(y), epoch)
    129             avg_loss = avg_loss * avg_mom + loss * (1-avg_mom)
    130             debias_loss = avg_loss / (1 - avg_mom**batch_num)

~/Kaukasos/FADL2/fastai/model.py in step(self, xs, y, epoch)
     63             copy_fp32_to_model(self.m, self.fp32_params)
     64             torch.cuda.synchronize()
---> 65         return torch_item(raw_loss.data)
     66 
     67     def evaluate(self, xs, y):

~/Kaukasos/FADL2/fastai/model.py in torch_item(x)
     27         if res is not None: return res
     28 
---> 29 def torch_item(x): return x.item() if hasattr(x,'item') else x[0]
     30 
     31 class Stepper():

KeyboardInterrupt: 

.from_paths starts at -0.0007 or so, reaches above 0 within 3'ish iterations, and runs at around 5s/it.

.from_csv starts at -0.00366, dips to -0.0213 at iter 4, and prompty gets back up above 0, dipping below again at iter 7 & going back above at 8. Runs at around 5s/it.

They both seem just about the same, when transforms are set to stock.

However, I need to know why number of iterations was out of 608 for .from_paths and 414 for .from_csv. First I need to ensure the dataset size is equal for both DataLoader methods.

--> right, I answered that question - was silly. The procedure above built the CSV from files in the training directory -- which was after I created the validation set. So it had only read the training set files, and not the full dataset. Meaning also: if I hadn't reset the validation set before running the test above, both methods (.from_csv and .from_paths) would have had the same number of total iterations (414 in that case).


In [11]:
## dataset size for .from_paths:
count_files(PATH/'train')


Out[11]:
19439

In [26]:
## dataset size for .from_csv:
df = pd.read_csv(PATH/'train.csv')
df.items


Out[26]:
<bound method DataFrame.iteritems of                              filenames  cats
0       n02894605/n02894605_44136.JPEG   233
1       n02894605/n02894605_70618.JPEG   233
2       n02894605/n02894605_57969.JPEG   233
3       n02894605/n02894605_57217.JPEG   233
4       n02894605/n02894605_26115.JPEG   233
5       n02894605/n02894605_79044.JPEG   233
6       n02894605/n02894605_36918.JPEG   233
7       n02894605/n02894605_74409.JPEG   233
8       n02894605/n02894605_57337.JPEG   233
9       n02894605/n02894605_26864.JPEG   233
10      n02894605/n02894605_15211.JPEG   233
11      n02894605/n02894605_61478.JPEG   233
12     n02894605/n02894605_103868.JPEG   233
13      n02894605/n02894605_86482.JPEG   233
14      n02894605/n02894605_25816.JPEG   233
15     n02894605/n02894605_121711.JPEG   233
16      n02894605/n02894605_29318.JPEG   233
17      n02894605/n02894605_24721.JPEG   233
18      n02894605/n02894605_51257.JPEG   233
19      n02894605/n02894605_29824.JPEG   233
20      n02894605/n02894605_13652.JPEG   233
21      n02894605/n02894605_34440.JPEG   233
22      n02894605/n02894605_57537.JPEG   233
23       n02894605/n02894605_2436.JPEG   233
24      n02894605/n02894605_64529.JPEG   233
25      n02894605/n02894605_36723.JPEG   233
26      n02894605/n02894605_24755.JPEG   233
27       n02007558/n02007558_4964.JPEG   504
28      n02007558/n02007558_18192.JPEG   504
29       n02007558/n02007558_1510.JPEG   504
...                                ...   ...
19409    n03452741/n03452741_2717.JPEG    26
19410   n03452741/n03452741_20938.JPEG    26
19411    n03452741/n03452741_9609.JPEG    26
19412    n03452741/n03452741_8472.JPEG    26
19413   n03594945/n03594945_32810.JPEG   412
19414    n03594945/n03594945_9792.JPEG   412
19415    n03594945/n03594945_6164.JPEG   412
19416    n03594945/n03594945_7735.JPEG   412
19417   n03594945/n03594945_22857.JPEG   412
19418   n03594945/n03594945_21879.JPEG   412
19419    n03594945/n03594945_4958.JPEG   412
19420   n03594945/n03594945_26347.JPEG   412
19421    n03594945/n03594945_7190.JPEG   412
19422   n03594945/n03594945_14488.JPEG   412
19423   n03594945/n03594945_28036.JPEG   412
19424    n03594945/n03594945_6553.JPEG   412
19425    n03594945/n03594945_3488.JPEG   412
19426   n03594945/n03594945_10474.JPEG   412
19427   n03594945/n03594945_27335.JPEG   412
19428   n03594945/n03594945_11422.JPEG   412
19429   n03594945/n03594945_18236.JPEG   412
19430     n03594945/n03594945_912.JPEG   412
19431   n03594945/n03594945_17650.JPEG   412
19432   n03594945/n03594945_15544.JPEG   412
19433   n03594945/n03594945_11943.JPEG   412
19434    n03594945/n03594945_8668.JPEG   412
19435   n03594945/n03594945_32070.JPEG   412
19436   n03594945/n03594945_15700.JPEG   412
19437    n03594945/n03594945_5563.JPEG   412
19438   n03594945/n03594945_27010.JPEG   412

[19439 rows x 2 columns]>

So now that I'm getting the same performance - as a sanity check that I'm not doing something very wrong - the only question that remains now is: what is the effect of different transforms on the Learner's ability to train. Am I breaking things by using wrong / bad transforms? Let's find out.

Vb. DataLoader Tests:

Checking transforms_from_stats vs transforms_from_model

tfms_from_stats(imagenet_stats) == tfms_from_model(darknet53) == tfms_from_model(resnet50) ?


In [6]:
sz = 256
bs = 32
darknet53 = darknet.darknet_53()

In [7]:
# tfms = tfms_from_stats(imagenet_stats, sz)
# tfms = tfms_from_model(darknet53, sz)
tfms = tfms_from_model(resnet50, sz)

In [8]:
model_data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms, val_name='train')

In [9]:
learner = ConvLearner.from_model_data(darknet53, model_data)

tfms = tfms_from_stats(imagenet_stats, sz):

Worked after about 2 failures.


In [10]:
learner.lr_find()
learner.sched.plot()


 13%|█▎        | 77/608 [05:39<39:00,  4.41s/it, loss=0.0117]  

tfms = tfms_from_model(darknet53, sz):

Worked after about 2 failures.


In [10]:
learner.lr_find()
learner.sched.plot()


  8%|▊         | 49/608 [03:37<41:24,  4.44s/it, loss=0.0326]  

tfms = tfms_from_model(resnet50, sz):

After 4 failures: the 1st plot. Subsequent try resulted in a similar 'low-res' plot. 1 more failture, then the 2nd plot was created.


In [10]:
learner.lr_find()
learner.sched.plot()


  3%|▎         | 20/608 [01:38<48:22,  4.94s/it, loss=0.0251]  

In [10]:
learner.lr_find()
learner.sched.plot()


 23%|██▎       | 137/608 [10:36<36:28,  4.65s/it, loss=-0.00101] 

Vc. Cross Entropy Loss


In [6]:
sz = 256
bs = 32
darknet53 = darknet.darknet_53()

In [7]:
# tfms = tfms_from_stats(imagenet_stats, sz)
tfms = tfms_from_model(darknet53, sz)

In [8]:
model_data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms, val_name='train')

In [9]:
# learner = ConvLearner.from_model_data(darknet53, model_data)
learner = ConvLearner.from_model_data(darknet53, model_data, crit=F.cross_entropy)

In [10]:
learner.crit


Out[10]:
<function torch.nn.functional.cross_entropy(input, target, weight=None, size_average=True, ignore_index=-100, reduce=True)>

tfms = tfms_from_model(darknet53, sz):


In [11]:
learner.lr_find()
learner.sched.plot()


 93%|█████████▎| 565/608 [43:49<03:20,  4.65s/it, loss=47.4]

Right, so it turns out that the loss function was automatically being set to NLL Loss instead of CE Loss (Negative-Log Likelihood; Cross Entropy (combines Log-Softmax & NLL Loss)). Setting it to cross entropy fixed all issues immediately. Thanks to SGugger.

So as a final sanity check: I'll run the learning rate finder again, this time with all my transform hyparameters used -- just to make sure it works.

Then train on a the ImageNet sampleset using Cyclical Learning Rates.

Vd. Sanity Check: Cross Entropy w/ extra Data Augmentation

Since I'm just checking that the Learning Rate Finder will run on this, I'm point the validation set at the training folder (to avoid an error - since I'm not actually training right now).


In [14]:
bs = 32
sz = 256

darknet53 = darknet.darknet_53()

tfms = tfms_from_stats(imagenet_stats, sz, aug_tfms=transforms_side_on, 
                       max_zoom=1.05, pad=sz//8)
model_data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms, val_name='train')

learner = ConvLearner.from_model_data(darknet53, model_data, crit=F.cross_entropy)

In [7]:
learner.lr_find()
learner.sched.plot()


 92%|█████████▏| 559/608 [41:30<03:38,  4.46s/it, loss=34.6]

Misc: Viewing Augmentations:

Viewing augmentations (adapted from fastai DL1 Lesson1):


In [64]:
def get_augs(iters=1):
    data = ImageClassifierData.from_paths(PATH, bs=2, tfms=tfms, num_workers=1, val_name='train')
    data_iter = iter(data.aug_dl)
    for i in range(iters):
        x,_ = next(data_iter)
    return data.trn_ds.denorm(x)[1]
def plots(ims, figsize=(12,6), rows=1, titles=None):
    f = plt.figure(figsize=figsize)
    for i in range(len(ims)):
        sp = f.add_subplot(rows, len(ims)//rows, i+1)
        sp.axis('Off')
        if titles is not None: sp.set_title(titles[i], fontsize=16)
        plt.imshow(ims[i])

In [65]:
ims = np.stack([get_augs(20) for i in range(6)])
plots(ims, rows=2)


I checked other pictures - they look fine. This one looks tough though.. Hopefully it's not too much of an issue for the model to train -- with so much of the fish being out of frame sometimes.

At this point, I'm ready to get started with the model. I'll use Cyclical Learning Rates (Leslie Smith, 2015 arχiv: 1506.01186). From what I understand from the paper and from the fast.ai forums, I'll run it for a cycle_len of 3 epochs.

End


In [ ]: