State Farm Logbook

Before 2/25/2017

First, I created two notebooks:

  • setup.ipynb - for moving data around & setting up data/train, data/valid, etc.
  • roger.ipynb - to experiment with vgg16bn as a first attempt

I didn't have much success with this because:

  • setup did not select individual drivers, just random images
  • hadn't looked at Jeremy's successful runs.

I just got ~10% accuracy (i.e., chance, with 10 classes) over & over.

Next

I adjusted setup.ipynb to separate images by driver. Noticing the drivers included both men and women, I thought I should ensure both were represented, so I put 3 men + 2 women = 5 drivers in the validation set. I didn't notice much improvement.
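The driver-based split can be sketched like this. It's a minimal sketch, assuming a driver → images mapping has already been loaded (on Kaggle this comes from driver_imgs_list.csv); the driver ids and filenames below are illustrative.

```python
def split_by_driver(driver_to_images, valid_drivers):
    """Put every image from the chosen drivers into the validation set,
    everything else into training, so no driver appears in both sets."""
    train, valid = [], []
    for driver, images in driver_to_images.items():
        (valid if driver in valid_drivers else train).extend(images)
    return train, valid

# Illustrative data; the real mapping comes from Kaggle's driver_imgs_list.csv.
drivers = {
    "p002": ["img_1.jpg", "img_2.jpg"],
    "p012": ["img_3.jpg"],
    "p014": ["img_4.jpg", "img_5.jpg"],
}
train, valid = split_by_driver(drivers, valid_drivers={"p014"})
```

The point of the split is that validation images come from drivers the model has never seen, which is what the Kaggle test set looks like.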

Reading forums, I decided to try harder to replicate Jeremy's results.

2/25/2017

Need to find out what training is actually doing currently. So, I created histogram & confusion matrix plots. I can now see that the model just picks 1 or 2 categories.

2/26/2017

Try only 3 drivers in the validation set (I'm currently using 5). Jeremy said in the forum "I'm using 3", so let's go with that. Another pass through setup.ipynb to accomplish this.

Also figured out that I could use the histogram's standard deviation as a single scalar metric of goodness, to make sure the model is not collapsing onto only one or two categories.
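The metric can be sketched in plain Python: count the predicted classes, then take the standard deviation of the counts. A perfectly uniform spread gives 0; a collapse onto one class gives a large value (the logged runs print numpy's version of the same counts).

```python
def histo_std(predicted_classes, num_classes=10):
    """Std deviation of per-class prediction counts.
    Low = predictions spread across classes; high = collapsed onto a few."""
    counts = [0] * num_classes
    for c in predicted_classes:
        counts[c] += 1
    mean = sum(counts) / num_classes
    var = sum((n - mean) ** 2 for n in counts) / num_classes
    return counts, var ** 0.5

# Collapsed predictor: all 100 predictions land in class 2.
counts, std = histo_std([2] * 100)
# Uniform predictor: 10 predictions per class -> std 0.
_, std_uniform = histo_std(list(range(10)) * 10)
```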

Made a new script statefarm.py from my skeleton code at https://github.com/rogerallen/pyskel so that I could pass command-line options & shmoo across various parameters.

Shmoo-ed the learning rate via some shmoo0.sh and shmoo1.sh scripts:

!egrep 'std|histo' shmoo0.log

1e-1 std 807.678804
     histo [   0    0 2698   57    0    0    0    0    0    0]
1e-2 std 646.498917
     histo [ 429    0  120   29    0    0 2177    0    0    0]
1e-3 std 702.191035
     histo [2377    0    0  101  126    9  114   19    0    9]
1e-4 std 335.245358
     histo [277 841  27   5 176  67 993  59  43 267]
1e-5 std 288.689193
     histo [ 29 613  20 148  94 771  67 208 733  72]
1e-6 std 213.648894
     histo [276 157 411 417  88 395  62 166  24 759]
1e-7 std 301.634298
     histo [ 26  76  54 157 177   1 744 584 100 836]
1e-8 std 374.224598
     histo [ 103   15    3  305  454  297   53    4 1309  212]
1e-9 std 420.059103
     histo [ 406   53   10   21 1427    1  313  467   54    3]

So, you can see 1e-4 to 1e-7 look good (the lowest stds, i.e., the most even spread across categories). Shmoo-ing some more...

!egrep 'run_model|std|histo' shmoo1*.log

1e-5 shmoo1.log:run_model N:128 M:128 p:0.800000 lr0:0.000010 ep0:3
     shmoo1.log:std 193.912996
     shmoo1.log:histo [301 558 211  10 135 491 432 147 470   0]
shmoo1.log:run_model N:128 M:128 p:0.800000 lr0:0.000005 ep0:3
shmoo1.log:std 447.512067
shmoo1.log:histo [   5  107  768   13  203    1  126 1448    4   80]
shmoo1.log:run_model N:128 M:128 p:0.800000 lr0:0.000001 ep0:3
shmoo1.log:std 382.288438
shmoo1.log:histo [  23    7  329  325  720   38 1217   29   65    2]
shmoo1.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:3
shmoo1.log:std 321.579928
shmoo1.log:histo [ 178   22  138 1113  262  120   66  253  593   10]
shmoo1.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:3
shmoo1.log:std 316.996293
shmoo1.log:histo [ 33  30  62   7 385 959 734 310   0 235]

shmoo1a.log:run_model N:128 M:128 p:0.800000 lr0:0.000010 ep0:9
shmoo1a.log:std 355.040068
shmoo1a.log:histo [   9   20   88   12  681 1182  171  208  147  237]
shmoo1a.log:run_model N:128 M:128 p:0.800000 lr0:0.000005 ep0:9
shmoo1a.log:std 410.171489
shmoo1a.log:histo [ 132  226   19  259   73    0 1362  667    3   14]
shmoo1a.log:run_model N:128 M:128 p:0.800000 lr0:0.000001 ep0:9
shmoo1a.log:std 346.864599
shmoo1a.log:histo [  40  219   18  518  310   21   27  407 1175   20]
5e-7 shmoo1a.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
     shmoo1a.log:std 256.663301
     shmoo1a.log:histo [821 430  44 407 471  19   1 368  11 183]
shmoo1a.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1a.log:std 211.431904
shmoo1a.log:histo [457 448 573  84 596  85   2 258 139 113]

shmoo1b.log:run_model N:128 M:128 p:0.800000 lr0:0.000001 ep0:9
shmoo1b.log:std 385.172753
shmoo1b.log:histo [ 453  562  100  143    6    0   66  132    1 1292]
shmoo1b.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1b.log:std 373.252796
shmoo1b.log:histo [  37    8    1   17 1223  203   13  510  175  568]
1e-7 shmoo1b.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
     shmoo1b.log:std 240.348601
     shmoo1b.log:histo [651  24  36 436 233 588 141  72  31 543]
shmoo1b.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1b.log:std 354.691486
shmoo1b.log:histo [  84    0    0   68  492  521  150  103  137 1200]
shmoo1b.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1b.log:std 379.643583
shmoo1b.log:histo [ 128  100  386    2   79   23   99   12  686 1240]

1e-6 shmoo1b1.log:run_model N:128 M:128 p:0.800000 lr0:0.000001 ep0:9
     shmoo1b1.log:std 277.625737
     shmoo1b1.log:histo [110  41  21 366 244 662 868  21  68 354]
shmoo1b1.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1b1.log:std 436.321957
shmoo1b1.log:histo [ 657    5 1462  143  140   27   99    0   15  207]
shmoo1b1.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1b1.log:std 336.395080
shmoo1b1.log:histo [175 387 928  17  18 896  98  34 186  16]
5e-8 shmoo1b1.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
     shmoo1b1.log:std 282.525840
     shmoo1b1.log:histo [ 86 185 650  56 947 250 327  54  80 120]
shmoo1b1.log:run_model N:128 M:128 p:0.800000 lr0:0.000000 ep0:9
shmoo1b1.log:std 369.717798
shmoo1b1.log:histo [ 365  397  118  248   56  111   49   85    9 1317]

My takeaway here is that roughly lr=1e-6, 5 epochs, and watching for a val_loss < 2.5 would be a good start. Let's try for that...
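Picking out the "good" runs can be automated instead of eyeballed. A sketch that pairs each run's lr0 with the std that follows it in the log and keeps runs under a cutoff; the log format mirrors the excerpts above, and the cutoff value is illustrative:

```python
import re

def select_runs(log_text, max_std=300.0):
    """Pair each lr0 from a run_model line with the std that follows it,
    and keep only runs whose histogram std is under the cutoff."""
    runs = re.findall(r"lr0:(\S+).*?\bstd (\d+\.\d+)", log_text, re.DOTALL)
    return [(lr, float(std)) for lr, std in runs if float(std) < max_std]

# Synthetic excerpt in the same shape as the shmoo logs above.
log = """run_model N:128 M:128 p:0.800000 lr0:0.000010 ep0:3
std 193.912996
run_model N:128 M:128 p:0.800000 lr0:0.000005 ep0:3
std 447.512067"""
good = select_runs(log)
```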

Trying a variety of things like...

gyre 15:11:36 statefarm> ./statefarm.py -v --ep0 5 --lr 5e-7 --label a04 | tee try_a04.log                                                                                     
Using gpu device 0: GeForce GTX 1070 (CNMeM is disabled, cuDNN 5105)
/home/rallen/anaconda2/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
Using Theano backend.
INFO:root:Options: {'dropout_rate': 0.8, 'verbose': 1, 'M': 128, 'ep0': 5, 'label': 'a04', 'lr0': 5e-07, 'N': 128}, Args: []
Found 19669 images belonging to 10 classes.
Found 2755 images belonging to 10 classes.
Found 79726 images belonging to 1 classes.
Found 2755 images belonging to 10 classes.
run_model N:128 M:128 p:0.800000 lr0:5e-07 ep0:5 label:a04
Train on 19669 samples, validate on 2755 samples
Epoch 1/5
19669/19669 [==============================] - 6s - loss: 5.0120 - acc: 0.0956 - val_loss: 2.6358 - val_acc: 0.0897
Epoch 2/5
19669/19669 [==============================] - 5s - loss: 4.9335 - acc: 0.1061 - val_loss: 2.6586 - val_acc: 0.0918
Epoch 3/5
19669/19669 [==============================] - 5s - loss: 4.9904 - acc: 0.1019 - val_loss: 2.6619 - val_acc: 0.0918
Epoch 4/5
19669/19669 [==============================] - 6s - loss: 4.9753 - acc: 0.1015 - val_loss: 2.6590 - val_acc: 0.0907
Epoch 5/5
19669/19669 [==============================] - 5s - loss: 4.9320 - acc: 0.1032 - val_loss: 2.6596 - val_acc: 0.0918
std 261.467
histo [770  20   3   2  40 562 157 457 268 476]
[[117   1   0   0   0  30   3  61  38  41]
 [ 34   0   0   0   3 119  12  12  63  43]
 [ 90   1   0   0  10  56   4   7  35  80]
 [ 42   6   0   0   0  35  35  68  28  67]
 [ 78   9   0   0   0  27  55  70   7  43]
 [ 63   0   2   0  15  57   1  44  16  89]
 [ 90   0   0   0   9  26  13  71   4  64]
 [ 53   1   1   0   1  99   6  27  32  19]
 [ 84   2   0   2   2  53  17  41  28  19]
 [119   0   0   0   0  60  11  56  17  11]]
saving weights_a04.h5

Okay, let's try the next step with this one:

gyre 15:15:11 statefarm> ./statefarm.py -v --ep0 5 --lr 1e-7 --label a07 | tee try_a07.log
Using gpu device 0: GeForce GTX 1070 (CNMeM is disabled, cuDNN 5105)
/home/rallen/anaconda2/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
Using Theano backend.
INFO:root:Options: {'dropout_rate': 0.8, 'verbose': 1, 'M': 128, 'ep0': 5, 'label': 'a07', 'lr0': 1e-07, 'N': 128}, Args: []
Found 19669 images belonging to 10 classes.
Found 2755 images belonging to 10 classes.
Found 79726 images belonging to 1 classes.
Found 2755 images belonging to 10 classes.
run_model N:128 M:128 p:0.800000 lr0:1e-07 ep0:5 label:a07
Train on 19669 samples, validate on 2755 samples
Epoch 1/5
19669/19669 [==============================] - 6s - loss: 5.0160 - acc: 0.1014 - val_loss: 2.6098 - val_acc: 0.0846
Epoch 2/5
19669/19669 [==============================] - 5s - loss: 5.0217 - acc: 0.0988 - val_loss: 2.6231 - val_acc: 0.0846
Epoch 3/5
19669/19669 [==============================] - 5s - loss: 5.0294 - acc: 0.0994 - val_loss: 2.6231 - val_acc: 0.0864
Epoch 4/5
19669/19669 [==============================] - 5s - loss: 4.9833 - acc: 0.1005 - val_loss: 2.6224 - val_acc: 0.0838
Epoch 5/5
19669/19669 [==============================] - 5s - loss: 5.0520 - acc: 0.0993 - val_loss: 2.6207 - val_acc: 0.0846
std 165.345
histo [163 486 274 207 345 187 634 174 254  31]
[[ 29  50   6  10  26  77  60  10  21   2]
 [  0  22  14   8  98  33  77  16  17   1]
 [  0  61  10  15   1  16 100   1  76   3]
 [ 63  29  23  34  67  14   4  16  18  13]
 [ 23  41  14  58  19  14  50  66   2   2]
 [  7  82  25  15  72   0  63  20   3   0]
 [  3  36  27   7  46   6 105  15  32   0]
 [  6  73  20  23   9   1  68   6  31   2]
 [  8  86  82   9   2  14  34   3   5   5]
 [ 24   6  53  28   5  12  73  21  49   3]]
saving weights_a07.h5

I've been doing this:

 2577  2017-02-26 15:07:17 ./statefarm.py -v --ep0 5 --lr 1e-6 --label a00 | tee try_a00.log
 2578  2017-02-26 15:08:18 ./statefarm.py -v --ep0 5 --lr 1e-6 --label a01 | tee try_a01.log
 2579  2017-02-26 15:09:51 ./statefarm.py -v --ep0 5 --lr 1e-5 --label a02 | tee try_a02.log
 2580  2017-02-26 15:10:48 ./statefarm.py -v --ep0 5 --lr 1e-6 --label a03 | tee try_a03.log
 2581  2017-02-26 15:12:09 ./statefarm.py -v --ep0 5 --lr 5e-7 --label a04 | tee try_a04.log
 2582  2017-02-26 15:13:17 ./statefarm.py -v --ep0 5 --lr 5e-7 --label a05 | tee try_a05.log
 2583  2017-02-26 15:14:24 ./statefarm.py -v --ep0 5 --lr 1e-8 --label a06 | tee try_a06.log
 2584  2017-02-26 15:15:23 ./statefarm.py -v --ep0 5 --lr 1e-7 --label a07 | tee try_a07.log
 2585  2017-02-26 15:22:32 ./statefarm.py -v --ep0 5 --lr 1e-3 --load a07 --label b01 | tee try_b01.log
 2586  2017-02-26 15:23:52 ./statefarm.py -v --ep0 5 --lr 1e-4 --load a07 --label b02 | tee try_b02.log
 2587  2017-02-26 15:24:58 ./statefarm.py -v --ep0 5 --lr 1e-5 --load a07 --label b03 | tee try_b03.log
 2588  2017-02-26 15:26:29 ./statefarm.py -v --ep0 30 --lr 1e-5 --load b03 --label c00 | tee try_c00.log
loss: 4.7181 - acc: 0.1045 - val_loss: 2.5546 - val_acc: 0.0933
loss: 3.4427 - acc: 0.1141 - val_loss: 2.3020 - val_acc: 0.1064
std 254.784
histo [ 30 372 250 495  96 123 891  51 366  81]
[[  0  16  33  70   9  34  83   1  39   6]
 [  0  46   3  41  60   7 107   2  13   7]
 [  0  13  23  30   3  33 113   1  67   0]
 [  3  36  14 104   8  17  44   3  27  25]
 [ 16   8  60  38   0  20  63  18  39  27]
 [  4  43  35  76   6   1  97   5  18   2]
 [  0 138  14  20   9   2  71   5  10   8]
 [  0  24  28  34   0   2  93  12  46   0]
 [  0  41  39  60   1   5  69   3  30   0]
 [  7   7   1  22   0   2 151   1  77   6]]

 2589  2017-02-26 15:30:22 ./statefarm.py -v --ep0 30 --lr 1e-5 --load c00 --label d00 | tee try_d00.log
loss: 3.4031 - acc: 0.1159 - val_loss: 2.3010 - val_acc: 0.1147
loss: 2.9054 - acc: 0.1190 - val_loss: 2.2935 - val_acc: 0.1009
std 390.517
histo [  51  259   46  377   86  111 1388   98    1  338]
[[  0  20  11  60   7  47 128   0   0  18]
 [  0  28   0  61  28   3 153   7   0   6]
 [  0  31  18  12   1  14 161  44   0   2]
 [ 35  51   0  25   9   6  90   6   0  59]
 [ 11  13   9   9   6  25  97   8   0 111]
 [  2  30   0  37  18   3 138   0   0  59]
 [  0  26   1  44  13   2 156   6   0  29]
 [  0  29   2  35   0   4 133  24   0  12]
 [  0  28   5  21   4   4 158   2   1  25]
 [  3   3   0  73   0   3 174   1   0  17]]

 2590  2017-02-26 15:34:46 ./statefarm.py -v --ep0 30 --lr 5e-5 --load d00 --label e00 | tee try_e00.log
loss: 2.8726 - acc: 0.1159 - val_loss: 2.2992 - val_acc: 0.1256
loss: 2.3973 - acc: 0.1738 - val_loss: 2.3018 - val_acc: 0.0755
std 298.178
histo [664  57 115 439  37 915  58  84   0 386]
[[ 16   4  42  29  20 158   2   1   0  19]
 [ 43   1   5  66   0  86   7   8   0  70]
 [ 30  10  17   7  14 149  13  31   0  12]
 [144   0   0  55   0  76   0   1   0   5]
 [154  16   0  21   0  84   0   0   0  14]
 [ 32   1   2 134   1  30   1  25   0  61]
 [ 97   9  11  15   1  82  14   2   0  46]
 [ 74   8  25  10   0  92   4   4   0  22]
 [ 11   6  13  64   0  84   3   1   0  66]
 [ 63   2   0  38   1  74  14  11   0  71]]

I'm able to overfit (training accuracy 0.7+), but validation accuracy is still awful (~0.1). Still not seeing how to get validation going.

 2592  2017-02-26 22:21:25 ./statefarm.py -v --ep0 30 --lr 1e-5 --load d00 --label e01 | tee try_e01.log
 2593  2017-02-26 22:25:33 ./statefarm.py -v --ep0 30 --lr 1e-3 --load d00 --label e02 | tee try_e02.log
 2594  2017-02-26 22:29:43 ./statefarm.py -v --ep0 30 --lr 1e-3 --load e02 --label f00 | tee try_f00.log
 2595  2017-02-26 22:33:31 ./statefarm.py -v --ep0 30 --lr 1e-3 --load f00 --label g00 | tee try_g00.log
 2596  2017-02-26 22:37:42 ./statefarm.py -v --ep0 30 --lr 1e-3 --load g00 --label h00 | tee try_h00.log
 2597  2017-02-26 22:41:17 ./statefarm.py -v --ep0 30 --lr 1e-3 --load h00 --label i00 | tee try_i00.log
 2598  2017-02-26 22:45:34 ./statefarm.py -v --ep0 30 --lr 1e-3 --load i00 --label j00 | tee try_j00.log
 2599  2017-02-26 22:49:18 ./statefarm.py -v --ep0 30 --lr 1e-3 --load j00 --label k00 | tee try_k00.log

loss: 0.6906 - acc: 0.7746 - val_loss: 3.0194 - val_acc: 0.0918
std 196.26
histo [235  75  25 393 172 335 663 437  26 394]
[[  0  15   2  52  21  41  75  48   6  31]
 [ 34  18   0  87   8  51  58   5   1  24]
 [  5   1   0   8  13  44  56  79   1  76]
 [ 52   6   0  19  14  43  66  29   2  50]
 [ 69   6   1   4  13   3  78  43   0  72]
 [  7   1   1  75  19   8  87  76   0  13]
 [ 11   5   0  18  13  16 126  44  12  32]
 [ 25   9  16  31  13  58  45  30   4   8]
 [  2  10   1  72  47  19  31  17   0  49]
 [ 30   4   4  27  11  52  41  66   0  39]]

2/27/2017

Let's try to recreate some of Jeremy's success with "simple" non-VGG models from the first part of his statefarm.ipynb. Indeed, I'm getting great results (100% test acc, 57% valid acc) with the conv1() example code: just 2 3x3 CNN layers (no VGG at all--wtf?)

I must have some mistake in my validation features or labels...

Okay, recreating my .dat files. This time no data augmentation at all.

Trying:

./statefarm.py -v --ep0 5 --lr 1e-6 --save 0227_0 | tee logs/try_0227.log
...
5s - loss: 5.0216 - acc: 0.1013 - val_loss: 2.6151 - val_acc: 0.0595
std 203.133
histo [184 304 484  30 326 307 482 614   6  18]
[[ 13  51  43   5  16  15  32 115   1   0]
 [ 15   2  95   0  26   9  60  79   0   0]
 [  3  22  31   2   5  90  73  57   0   0]
 [  0  19  78  10  33  34  31  61   0  15]
 [  1  41  77   6  35  15  55  55   1   3]
 [ 23 110  19   3  20  16  34  62   0   0]
 [ 57  24  16   0  76  43  31  30   0   0]
 [ 44   4  33   0  13  31  89  25   0   0]
 [ 22  15  63   3  12   8  37  87   1   0]
 [  6  16  29   1  90  46  40  43   3   0]]
saving weights_0227_0.h5

Learned of verbose=2 in the fit() function, and of stdbuf for unbuffered log output:

stdbuf -o 0 ./statefarm.py -v --ep0 20 --lr 1e-5 --load 0227_0 --save 0227_01 | tee logs/try_0227_01.log

 2774  2017-02-27 23:24:24 stdbuf -o 0 ./statefarm.py -v --ep0 20 --lr 1e-7 --save 0227_2 | tee logs/try_0227_2.log
 2775  2017-02-27 23:27:30 stdbuf -o 0 ./statefarm.py -v --ep0 20 --lr 1e-5 --load 0227_2 --save 0227_20 | tee logs/try_0227_20.log
 2776  2017-02-27 23:30:34 stdbuf -o 0 ./statefarm.py -v --ep0 20 --lr 1e-3 --load 0227_2 --save 0227_21 | tee logs/try_0227_21.log
 2777  2017-02-27 23:33:41 stdbuf -o 0 ./statefarm.py -v --ep0 20 --lr 1e-3 --load 0227_21 --save 0227_210 | tee logs/try_0227_210.log
 2778  2017-02-27 23:36:37 stdbuf -o 0 ./statefarm.py -v --ep0 20 --lr 1e-3 --load 0227_210 --save 0227_2100 | tee logs/try_0227_2100.log
 2779  2017-02-27 23:40:01 stdbuf -o 0 ./statefarm.py -v --ep0 50 --lr 1e-3 --load 0227_2100 --save 0227_21000 | tee logs/try_0227_21000.log

5s - loss: 1.6122 - acc: 0.4488 - val_loss: 2.9105 - val_acc: 0.0926
std 198.532
histo [211  47 298 272 541 680 327  43 276  60]
[[ 11   7  55  34  70  63  10   3  38   0]
 [ 38  10  45   2 105  59   5   2  11   9]
 [  5   0   8  82  57  69  30  18   3  11]
 [ 22   2  42  44   6 115  36   1  13   0]
 [  5   3  11  28  60 105  54   0  23   0]
 [ 12   1  43  29  94  79  21   3   5   0]
 [ 32   1  43  11  35  27  12   6  80  30]
 [ 50   6  10  10  12  69  45   4  30   3]
 [ 24   1  20   7  52  39  69   4  26   6]
 [ 12  16  21  25  50  55  45   2  47   1]]
saving weights_0227_21000.h5

Seems to be producing the same end result: overfitting without moving the validation accuracy, which never gets very far from 10%.

TODO

  • From the forums: "The best approach would be to create a bunch of models, each time holding out one driver, and then average the validation across all of them (you could also average the predictions across all of them, like the ensembling we did in MNIST last week!) But I wouldn't bother with that until you had done all the experimenting you wanted to do, since it adds a lot of time to each experiment."
  • See twairball (Jerry Liu)'s response from Feb in http://forums.fast.ai/t/statefarm-kaggle-comp/183/111 -- lots of good insight on different ideas to try.
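The hold-one-driver-out idea from the first bullet can be sketched as follows. Here `train_and_eval` is a hypothetical stand-in for a real training run returning a validation score; the fold loop and the averaging are the point.

```python
def leave_one_driver_out(drivers, train_and_eval):
    """For each driver, train with that driver held out for validation,
    then average the per-fold validation scores (the per-fold predictions
    could likewise be averaged, MNIST-ensembling style)."""
    scores = []
    for held_out in drivers:
        train_drivers = [d for d in drivers if d != held_out]
        scores.append(train_and_eval(train_drivers, held_out))
    return sum(scores) / len(scores)

# Stand-in evaluation: pretend each fold's val accuracy is fixed.
fake_scores = {"p002": 0.50, "p012": 0.60, "p014": 0.70}
avg = leave_one_driver_out(
    list(fake_scores), lambda train, held: fake_scores[held]
)
```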
