Enter State Farm


In [1]:
from __future__ import division, print_function
%matplotlib inline
# path = "data/state/"
path = "data/state/sample/"
from importlib import reload  # Python 3
import utils; reload(utils)
from utils import *
from IPython.display import FileLink


Using cuDNN version 6021 on context None
Mapped name None to device cuda0: GeForce GTX TITAN X (0000:04:00.0)
Using Theano backend.

In [2]:
batch_size=64
#batch_size=1

Create sample

The following assumes you've already created your validation set - remember that the training and validation sets should contain different drivers, as mentioned on the Kaggle competition page.
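
A minimal sketch of such a split, assuming the competition's driver_imgs_list.csv (columns subject, classname, img) sits in data/state/ - the held-out driver IDs here are just an arbitrary example:

In [ ]:
import os
import pandas as pd
from shutil import move

drivers = pd.read_csv('data/state/driver_imgs_list.csv')
val_drivers = {'p002', 'p012', 'p014'}  # hypothetical choice of held-out drivers
for row in drivers.itertuples():
    if row.subject in val_drivers:
        # Move each held-out driver's images, preserving the class subdirectory
        os.makedirs('data/state/valid/' + row.classname, exist_ok=True)
        move('data/state/train/{}/{}'.format(row.classname, row.img),
             'data/state/valid/{}/{}'.format(row.classname, row.img))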


In [ ]:
%cd data/state

In [ ]:
%cd train

In [ ]:
%mkdir ../sample
%mkdir ../sample/train
%mkdir ../sample/valid

In [ ]:
for d in glob('c?'):
    os.mkdir('../sample/train/'+d)
    os.mkdir('../sample/valid/'+d)

In [ ]:
from shutil import copyfile

In [ ]:
g = glob('c?/*.jpg')
shuf = np.random.permutation(g)
# shuf[i] is like 'c0/img_123.jpg', so the class subdirectory is preserved
for i in range(1500): copyfile(shuf[i], '../sample/train/' + shuf[i])

In [ ]:
%cd ../valid

In [ ]:
g = glob('c?/*.jpg')
shuf = np.random.permutation(g)
# Copy 1000 random validation images into the sample, keeping class subdirectories
for i in range(1000): copyfile(shuf[i], '../sample/valid/' + shuf[i])

In [ ]:
%cd ../../..

In [ ]:
%mkdir data/state/results

In [ ]:
%mkdir data/state/sample/test

Create batches


In [3]:
batches = get_batches(path+'train', batch_size=batch_size)
val_batches = get_batches(path+'valid', batch_size=batch_size*2, shuffle=False)


Found 1500 images belonging to 10 classes.
Found 1000 images belonging to 10 classes.

In [4]:
(val_classes, trn_classes, val_labels, trn_labels, val_filenames, filenames,
    test_filenames) = get_classes(path)


Found 1500 images belonging to 10 classes.
Found 1000 images belonging to 10 classes.
Found 1000 images belonging to 1 classes.

In [5]:
steps_per_epoch = int(np.ceil(batches.samples/batch_size))
validation_steps = int(np.ceil(val_batches.samples/(batch_size*2)))

Basic models

Linear model

First, we try the simplest model and use default parameters. Note the trick of making the first layer a batchnorm layer - that way we don't have to worry about normalizing the input ourselves.


In [6]:
model = Sequential([
        BatchNormalization(axis=1, input_shape=(3,224,224)),
        Flatten(),
        Dense(10, activation='softmax')
    ])

As you can see below, this training is going nowhere...


In [7]:
model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(batches, steps_per_epoch, epochs=2, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/2
24/24 [==============================] - 13s 554ms/step - loss: 13.7545 - acc: 0.1150 - val_loss: 14.0601 - val_acc: 0.1250
Epoch 2/2
24/24 [==============================] - 7s 292ms/step - loss: 13.9919 - acc: 0.1261 - val_loss: 14.6476 - val_acc: 0.0900
Out[7]:
<keras.callbacks.History at 0x7fb21413aa90>

Let's first check the number of parameters, to see that there are enough parameters to find some useful relationships:


In [8]:
model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
batch_normalization_1 (Batch (None, 3, 224, 224)       12        
_________________________________________________________________
flatten_1 (Flatten)          (None, 150528)            0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1505290   
=================================================================
Total params: 1,505,302
Trainable params: 1,505,296
Non-trainable params: 6
_________________________________________________________________

Over 1.5 million parameters - that should be enough. Incidentally, it's worth checking you understand why this is the number of parameters in this layer: the Dense weight matrix connects every input pixel to every output class, the extra 10 are the biases, and the batchnorm layer adds 12 more (a scale and shift per channel, plus the non-trainable running mean and variance):


In [9]:
10*3*224*224


Out[9]:
1505280

Since we have a simple model with no regularization and plenty of parameters, it seems most likely that our learning rate is too high. Perhaps it is jumping to a solution where it predicts one or two classes with high confidence, so that it can give a zero prediction to as many classes as possible - that's the best approach for a model that is no better than random, and that's likely where we would end up with a high learning rate. So let's check:


In [10]:
np.round(model.predict_generator(batches, int(np.ceil(batches.samples/batch_size)))[:10],2)


Out[10]:
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.]], dtype=float32)

Our hypothesis was correct. It's predicting the same class for every image, with very high confidence. So let's try a lower learning rate:


In [11]:
model = Sequential([
        BatchNormalization(axis=1, input_shape=(3,224,224)),
        Flatten(),
        Dense(10, activation='softmax')
    ])
model.compile(Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(batches, steps_per_epoch, epochs=2, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/2
24/24 [==============================] - 10s 415ms/step - loss: 2.3957 - acc: 0.1707 - val_loss: 3.9526 - val_acc: 0.1670
Epoch 2/2
24/24 [==============================] - 7s 305ms/step - loss: 1.7573 - acc: 0.4270 - val_loss: 2.4417 - val_acc: 0.2990
Out[11]:
<keras.callbacks.History at 0x7fb1f5467be0>

Great - we found our way out of that hole... Now we can increase the learning rate and see where we can get to.


In [12]:
model.optimizer.lr=0.001
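
Note that, depending on the Keras version, assigning a plain float to model.optimizer.lr can replace the underlying variable rather than updating its value, so the already-compiled training function may not see the change. A safer alternative to the assignment above (a sketch):

In [ ]:
import keras.backend as K
# Update the existing learning-rate variable in place
K.set_value(model.optimizer.lr, 0.001)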

In [13]:
model.fit_generator(batches, steps_per_epoch, epochs=4, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/4
24/24 [==============================] - 10s 409ms/step - loss: 1.4028 - acc: 0.5811 - val_loss: 1.7973 - val_acc: 0.3920
Epoch 2/4
24/24 [==============================] - 7s 304ms/step - loss: 1.1988 - acc: 0.6606 - val_loss: 1.3711 - val_acc: 0.5520
Epoch 3/4
24/24 [==============================] - 7s 304ms/step - loss: 0.9956 - acc: 0.7669 - val_loss: 1.0795 - val_acc: 0.6790
Epoch 4/4
24/24 [==============================] - 7s 296ms/step - loss: 0.8564 - acc: 0.8254 - val_loss: 1.0032 - val_acc: 0.6910
Out[13]:
<keras.callbacks.History at 0x7fb1f58ca0b8>

We're stabilizing at a validation accuracy of around 0.69. Not great, but a lot better than random. Before moving on, let's check that the validation set in our sample is large enough to give consistent results:


In [14]:
rnd_batches = get_batches(path+'valid', batch_size=batch_size*2, shuffle=True)


Found 1000 images belonging to 10 classes.

In [15]:
val_res = [model.evaluate_generator(rnd_batches, int(np.ceil(rnd_batches.samples/(batch_size*2)))) for i in range(10)]
np.round(val_res, 2)


Out[15]:
array([[ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69],
       [ 1.  ,  0.69]])

Yup, pretty consistent - if we see improvements of 3% or more, it's probably not random, based on the above samples.

L2 regularization

The previous model is over-fitting a lot, but we can't use dropout since we only have one layer. Instead, we can try to decrease the overfitting by adding L2 regularization (i.e. adding the sum of squares of the weights to our loss function):
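
Concretely, the kernel_regularizer=l2(0.01) used below adds 0.01 times the sum of the squared entries of that layer's weight matrix to the training loss. A sketch of the same penalty built by hand (for the model defined in the next cell):

In [ ]:
import keras.backend as K
# What l2(0.01) contributes to the loss for the final Dense layer's kernel
penalty = 0.01 * K.sum(K.square(model.layers[-1].kernel))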


In [16]:
model = Sequential([
        BatchNormalization(axis=1, input_shape=(3,224,224)),
        Flatten(),
        Dense(10, activation='softmax', kernel_regularizer=l2(0.01))
    ])
model.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(batches, steps_per_epoch, epochs=2, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/2
24/24 [==============================] - 10s 415ms/step - loss: 6.2808 - acc: 0.2777 - val_loss: 10.7599 - val_acc: 0.2000
Epoch 2/2
24/24 [==============================] - 7s 301ms/step - loss: 4.5643 - acc: 0.5124 - val_loss: 8.4116 - val_acc: 0.2180
Out[16]:
<keras.callbacks.History at 0x7fb1f49d7940>

In [17]:
model.optimizer.lr=0.001

In [18]:
model.fit_generator(batches, steps_per_epoch, epochs=4, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/4
24/24 [==============================] - 10s 411ms/step - loss: 3.5331 - acc: 0.6398 - val_loss: 4.6595 - val_acc: 0.4990
Epoch 2/4
24/24 [==============================] - 11s 454ms/step - loss: 3.0844 - acc: 0.7655 - val_loss: 3.7578 - val_acc: 0.6730
Epoch 3/4
24/24 [==============================] - 7s 302ms/step - loss: 3.0602 - acc: 0.8005 - val_loss: 3.6378 - val_acc: 0.6990
Epoch 4/4
24/24 [==============================] - 7s 297ms/step - loss: 2.9341 - acc: 0.8282 - val_loss: 3.6043 - val_acc: 0.7140
Out[18]:
<keras.callbacks.History at 0x7fb1f49024e0>

Looks like we can get a bit over 70% accuracy this way. This will be a good benchmark for our future models - if we can't beat 70%, then we're not even beating a linear model trained on a sample, so we'll know that's not a good approach.

Single hidden layer

The next simplest model is to add a single hidden layer.


In [19]:
model = Sequential([
        BatchNormalization(axis=1, input_shape=(3,224,224)),
        Flatten(),
        Dense(100, activation='relu'),
        BatchNormalization(),
        Dense(10, activation='softmax')
    ])
model.compile(Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(batches, steps_per_epoch, epochs=2, validation_data=val_batches, 
                 validation_steps=validation_steps)

model.optimizer.lr = 0.01
model.fit_generator(batches, steps_per_epoch, epochs=5, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/2
24/24 [==============================] - 10s 419ms/step - loss: 2.1258 - acc: 0.2935 - val_loss: 6.5839 - val_acc: 0.2170
Epoch 2/2
24/24 [==============================] - 7s 306ms/step - loss: 1.1776 - acc: 0.6712 - val_loss: 2.4076 - val_acc: 0.3600
Epoch 1/5
24/24 [==============================] - 10s 412ms/step - loss: 0.7388 - acc: 0.8476 - val_loss: 1.4086 - val_acc: 0.5210
Epoch 2/5
24/24 [==============================] - 7s 302ms/step - loss: 0.4562 - acc: 0.9350 - val_loss: 0.9815 - val_acc: 0.6730
Epoch 3/5
24/24 [==============================] - 7s 298ms/step - loss: 0.3088 - acc: 0.9707 - val_loss: 0.7960 - val_acc: 0.7710
Epoch 4/5
24/24 [==============================] - 7s 300ms/step - loss: 0.2147 - acc: 0.9883 - val_loss: 0.5896 - val_acc: 0.8690
Epoch 5/5
24/24 [==============================] - 7s 303ms/step - loss: 0.1586 - acc: 0.9966 - val_loss: 0.5113 - val_acc: 0.9060
Out[19]:
<keras.callbacks.History at 0x7fb1cd3ad0b8>

Better than the linear model, but the training accuracy is racing well ahead of the validation accuracy - and we know that CNNs are a much better choice for computer vision problems anyway. So we'll try one.

Single conv layer

Two conv layers, each with max pooling, followed by a simple dense network make a good simple CNN to start with:


In [20]:
def conv1(batches):
    model = Sequential([
            BatchNormalization(axis=1, input_shape=(3,224,224)),  # axis=1: channels-first ordering
            Conv2D(32,(3,3), activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D((3,3)),
            Conv2D(64,(3,3), activation='relu'),
            BatchNormalization(axis=1),
            MaxPooling2D((3,3)),
            Flatten(),
            Dense(200, activation='relu'),
            BatchNormalization(),
            Dense(10, activation='softmax')
        ])

    model.compile(Adam(lr=1e-4), loss='categorical_crossentropy', metrics=['accuracy'])
    # A couple of epochs at a low learning rate to get started, then increase it
    model.fit_generator(batches, steps_per_epoch, epochs=2, validation_data=val_batches, 
                     validation_steps=validation_steps)
    model.optimizer.lr = 0.001
    model.fit_generator(batches, steps_per_epoch, epochs=4, validation_data=val_batches, 
                     validation_steps=validation_steps)
    return model

In [21]:
conv1(batches)


Epoch 1/2
24/24 [==============================] - 14s 579ms/step - loss: 1.6482 - acc: 0.5001 - val_loss: 2.0257 - val_acc: 0.3220
Epoch 2/2
24/24 [==============================] - 13s 528ms/step - loss: 0.3435 - acc: 0.9365 - val_loss: 1.7710 - val_acc: 0.4580
Epoch 1/4
24/24 [==============================] - 14s 593ms/step - loss: 0.1012 - acc: 0.9935 - val_loss: 1.9505 - val_acc: 0.3610
Epoch 2/4
24/24 [==============================] - 9s 378ms/step - loss: 0.0423 - acc: 1.0000 - val_loss: 2.1335 - val_acc: 0.3540
Epoch 3/4
24/24 [==============================] - 9s 370ms/step - loss: 0.0218 - acc: 1.0000 - val_loss: 2.2278 - val_acc: 0.3530
Epoch 4/4
24/24 [==============================] - 9s 370ms/step - loss: 0.0155 - acc: 1.0000 - val_loss: 2.2392 - val_acc: 0.3510
Out[21]:
<keras.models.Sequential at 0x7fb1cd3b76a0>

The training accuracy here very rapidly reaches nearly 100%, while the validation accuracy falls away. So if we could regularize this, perhaps we could get a reasonable result.

So, what kind of regularization should we try first? As we discussed in lesson 3, we should start with data augmentation.

Data augmentation

To find the best data augmentation parameters, we can try each type of data augmentation, one at a time. For each type, we can try four very different levels of augmentation, and see which is the best. In the steps below we've only kept the single best result we found. We're using the CNN we defined above, since we have already observed it can model the data quickly and accurately.
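
As a sketch of that search loop for a single augmentation type (the four levels here are illustrative; conv1 and get_batches are as defined above):

In [ ]:
# Try four very different strengths of one augmentation type and compare
for w in [0.05, 0.1, 0.2, 0.4]:
    print('width_shift_range =', w)
    gen_t = image.ImageDataGenerator(width_shift_range=w)
    batches = get_batches(path+'train', gen_t, batch_size=batch_size)
    conv1(batches)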

Width shift: move the image left and right -


In [22]:
gen_t = image.ImageDataGenerator(width_shift_range=0.1)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)


Found 1500 images belonging to 10 classes.

In [23]:
model = conv1(batches)


Epoch 1/2
24/24 [==============================] - 23s 942ms/step - loss: 2.1960 - acc: 0.3186 - val_loss: 2.3333 - val_acc: 0.1860
Epoch 2/2
24/24 [==============================] - 12s 493ms/step - loss: 1.2196 - acc: 0.6140 - val_loss: 1.8536 - val_acc: 0.3150
Epoch 1/4
24/24 [==============================] - 18s 750ms/step - loss: 0.8203 - acc: 0.7581 - val_loss: 2.1126 - val_acc: 0.2210
Epoch 2/4
24/24 [==============================] - 14s 579ms/step - loss: 0.6040 - acc: 0.8186 - val_loss: 2.1821 - val_acc: 0.2440
Epoch 3/4
24/24 [==============================] - 14s 580ms/step - loss: 0.4531 - acc: 0.8806 - val_loss: 2.1092 - val_acc: 0.3410
Epoch 4/4
24/24 [==============================] - 14s 579ms/step - loss: 0.3807 - acc: 0.8951 - val_loss: 2.0756 - val_acc: 0.4050

Height shift: move the image up and down -


In [24]:
gen_t = image.ImageDataGenerator(height_shift_range=0.05)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)


Found 1500 images belonging to 10 classes.

In [25]:
model = conv1(batches)


Epoch 1/2
24/24 [==============================] - 22s 908ms/step - loss: 1.9735 - acc: 0.3839 - val_loss: 2.4732 - val_acc: 0.3150
Epoch 2/2
24/24 [==============================] - 12s 481ms/step - loss: 0.8101 - acc: 0.7655 - val_loss: 1.8873 - val_acc: 0.3630
Epoch 1/4
24/24 [==============================] - 18s 753ms/step - loss: 0.4499 - acc: 0.8867 - val_loss: 1.7647 - val_acc: 0.4560
Epoch 2/4
24/24 [==============================] - 14s 581ms/step - loss: 0.3107 - acc: 0.9228 - val_loss: 1.9022 - val_acc: 0.4630
Epoch 3/4
24/24 [==============================] - 18s 742ms/step - loss: 0.2011 - acc: 0.9550 - val_loss: 1.8878 - val_acc: 0.5090
Epoch 4/4
24/24 [==============================] - 12s 489ms/step - loss: 0.1264 - acc: 0.9811 - val_loss: 1.8655 - val_acc: 0.4660

Random shear angles (max in radians) -


In [26]:
gen_t = image.ImageDataGenerator(shear_range=0.1)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)


Found 1500 images belonging to 10 classes.

In [27]:
model = conv1(batches)


Epoch 1/2
24/24 [==============================] - 22s 905ms/step - loss: 1.7841 - acc: 0.4568 - val_loss: 2.6987 - val_acc: 0.2940
Epoch 2/2
24/24 [==============================] - 11s 476ms/step - loss: 0.5644 - acc: 0.8706 - val_loss: 1.5810 - val_acc: 0.4380
Epoch 1/4
24/24 [==============================] - 18s 754ms/step - loss: 0.2567 - acc: 0.9564 - val_loss: 1.6566 - val_acc: 0.4320
Epoch 2/4
24/24 [==============================] - 14s 590ms/step - loss: 0.1374 - acc: 0.9803 - val_loss: 1.7822 - val_acc: 0.3530
Epoch 3/4
24/24 [==============================] - 14s 584ms/step - loss: 0.0785 - acc: 0.9954 - val_loss: 1.9308 - val_acc: 0.2880
Epoch 4/4
24/24 [==============================] - 14s 590ms/step - loss: 0.0533 - acc: 0.9974 - val_loss: 1.9411 - val_acc: 0.3320

Rotation: max in degrees -


In [28]:
gen_t = image.ImageDataGenerator(rotation_range=15)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)


Found 1500 images belonging to 10 classes.

In [29]:
model = conv1(batches)


Epoch 1/2
24/24 [==============================] - 18s 760ms/step - loss: 2.1598 - acc: 0.3401 - val_loss: 2.1741 - val_acc: 0.2620
Epoch 2/2
24/24 [==============================] - 14s 584ms/step - loss: 1.0411 - acc: 0.6836 - val_loss: 1.9610 - val_acc: 0.3160
Epoch 1/4
24/24 [==============================] - 22s 917ms/step - loss: 0.6806 - acc: 0.8063 - val_loss: 2.0874 - val_acc: 0.3160
Epoch 2/4
24/24 [==============================] - 12s 480ms/step - loss: 0.4594 - acc: 0.8808 - val_loss: 2.3098 - val_acc: 0.2480
Epoch 3/4
24/24 [==============================] - 14s 598ms/step - loss: 0.3546 - acc: 0.9056 - val_loss: 2.2914 - val_acc: 0.2670
Epoch 4/4
24/24 [==============================] - 14s 588ms/step - loss: 0.2598 - acc: 0.9400 - val_loss: 2.3044 - val_acc: 0.3030

Channel shift: randomly changing the R,G,B colors -


In [30]:
gen_t = image.ImageDataGenerator(channel_shift_range=20)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)


Found 1500 images belonging to 10 classes.

In [31]:
model = conv1(batches)


Epoch 1/2
24/24 [==============================] - 10s 437ms/step - loss: 1.7513 - acc: 0.4785 - val_loss: 1.9210 - val_acc: 0.3480
Epoch 2/2
24/24 [==============================] - 9s 371ms/step - loss: 0.4443 - acc: 0.9005 - val_loss: 1.6773 - val_acc: 0.4710
Epoch 1/4
24/24 [==============================] - 11s 440ms/step - loss: 0.1581 - acc: 0.9759 - val_loss: 1.7266 - val_acc: 0.4550
Epoch 2/4
24/24 [==============================] - 13s 530ms/step - loss: 0.0651 - acc: 0.9972 - val_loss: 1.8107 - val_acc: 0.4330
Epoch 3/4
24/24 [==============================] - 9s 384ms/step - loss: 0.0307 - acc: 1.0000 - val_loss: 1.8463 - val_acc: 0.4370
Epoch 4/4
24/24 [==============================] - 9s 380ms/step - loss: 0.0177 - acc: 1.0000 - val_loss: 1.8323 - val_acc: 0.4170

And finally, putting it all together!


In [32]:
gen_t = image.ImageDataGenerator(rotation_range=15, height_shift_range=0.05, 
                shear_range=0.1, channel_shift_range=20, width_shift_range=0.1)
batches = get_batches(path+'train', gen_t, batch_size=batch_size)


Found 1500 images belonging to 10 classes.

In [33]:
model = conv1(batches)


Epoch 1/2
24/24 [==============================] - 19s 779ms/step - loss: 2.4391 - acc: 0.2263 - val_loss: 2.4587 - val_acc: 0.2390
Epoch 2/2
24/24 [==============================] - 19s 774ms/step - loss: 1.8407 - acc: 0.3868 - val_loss: 1.9039 - val_acc: 0.3680
Epoch 1/4
24/24 [==============================] - 22s 933ms/step - loss: 1.5542 - acc: 0.4849 - val_loss: 1.9373 - val_acc: 0.3300
Epoch 2/4
24/24 [==============================] - 12s 493ms/step - loss: 1.3974 - acc: 0.5381 - val_loss: 2.0541 - val_acc: 0.2020
Epoch 3/4
24/24 [==============================] - 15s 606ms/step - loss: 1.2788 - acc: 0.5787 - val_loss: 2.2025 - val_acc: 0.1740
Epoch 4/4
24/24 [==============================] - 15s 607ms/step - loss: 1.1586 - acc: 0.6207 - val_loss: 2.3139 - val_acc: 0.1390

At first glance, this isn't looking encouraging, since the validation accuracy is poor and getting worse. But the training accuracy is improving, and still has a long way to go - so we should try annealing our learning rate and running more epochs before we make a decision.


In [34]:
model.optimizer.lr = 0.0001
model.fit_generator(batches, steps_per_epoch, epochs=5, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/5
24/24 [==============================] - 19s 792ms/step - loss: 1.1171 - acc: 0.6365 - val_loss: 2.3372 - val_acc: 0.1660
Epoch 2/5
24/24 [==============================] - 18s 764ms/step - loss: 1.0486 - acc: 0.6621 - val_loss: 2.3370 - val_acc: 0.2090
Epoch 3/5
24/24 [==============================] - 12s 498ms/step - loss: 0.9903 - acc: 0.6737 - val_loss: 2.1373 - val_acc: 0.2760
Epoch 4/5
24/24 [==============================] - 18s 764ms/step - loss: 0.9099 - acc: 0.7065 - val_loss: 2.2080 - val_acc: 0.2590
Epoch 5/5
24/24 [==============================] - 16s 656ms/step - loss: 0.8636 - acc: 0.7277 - val_loss: 2.0751 - val_acc: 0.2880
Out[34]:
<keras.callbacks.History at 0x7fb1c33ede48>

Lucky we tried that - we're starting to make progress! Let's keep going.


In [35]:
model.fit_generator(batches, steps_per_epoch, epochs=25, validation_data=val_batches, 
                 validation_steps=validation_steps)


Epoch 1/25
24/24 [==============================] - 19s 776ms/step - loss: 0.7937 - acc: 0.7508 - val_loss: 1.8848 - val_acc: 0.3470
Epoch 2/25
24/24 [==============================] - 15s 615ms/step - loss: 0.7514 - acc: 0.7615 - val_loss: 1.6572 - val_acc: 0.3910
Epoch 3/25
24/24 [==============================] - 15s 615ms/step - loss: 0.7045 - acc: 0.7862 - val_loss: 1.4946 - val_acc: 0.4350
Epoch 4/25
24/24 [==============================] - 15s 610ms/step - loss: 0.6865 - acc: 0.7863 - val_loss: 1.4053 - val_acc: 0.4590
Epoch 5/25
24/24 [==============================] - 15s 615ms/step - loss: 0.6509 - acc: 0.7954 - val_loss: 0.9458 - val_acc: 0.6660
Epoch 6/25
24/24 [==============================] - 15s 607ms/step - loss: 0.6694 - acc: 0.7916 - val_loss: 1.0536 - val_acc: 0.6260
Epoch 7/25
24/24 [==============================] - 19s 773ms/step - loss: 0.6400 - acc: 0.8002 - val_loss: 0.9714 - val_acc: 0.6350
Epoch 8/25
24/24 [==============================] - 12s 497ms/step - loss: 0.5977 - acc: 0.8127 - val_loss: 0.6185 - val_acc: 0.8120
Epoch 9/25
24/24 [==============================] - 19s 778ms/step - loss: 0.5806 - acc: 0.8259 - val_loss: 0.6233 - val_acc: 0.7940
Epoch 10/25
24/24 [==============================] - 16s 670ms/step - loss: 0.5331 - acc: 0.8384 - val_loss: 0.5689 - val_acc: 0.8160
Epoch 11/25
24/24 [==============================] - 12s 499ms/step - loss: 0.5569 - acc: 0.8318 - val_loss: 0.5353 - val_acc: 0.8310
Epoch 12/25
24/24 [==============================] - 15s 605ms/step - loss: 0.5150 - acc: 0.8601 - val_loss: 0.5292 - val_acc: 0.8350
Epoch 13/25
24/24 [==============================] - 15s 608ms/step - loss: 0.4631 - acc: 0.8601 - val_loss: 0.4252 - val_acc: 0.8690
Epoch 14/25
24/24 [==============================] - 15s 614ms/step - loss: 0.4922 - acc: 0.8468 - val_loss: 0.4257 - val_acc: 0.8660
Epoch 15/25
24/24 [==============================] - 18s 769ms/step - loss: 0.4629 - acc: 0.8718 - val_loss: 0.4758 - val_acc: 0.8490
Epoch 16/25
24/24 [==============================] - 16s 647ms/step - loss: 0.4203 - acc: 0.8786 - val_loss: 0.3506 - val_acc: 0.8950
Epoch 17/25
24/24 [==============================] - 16s 659ms/step - loss: 0.3982 - acc: 0.8845 - val_loss: 0.3110 - val_acc: 0.9200
Epoch 18/25
24/24 [==============================] - 12s 511ms/step - loss: 0.4190 - acc: 0.8813 - val_loss: 0.2709 - val_acc: 0.9210
Epoch 19/25
24/24 [==============================] - 15s 626ms/step - loss: 0.3767 - acc: 0.8880 - val_loss: 0.2918 - val_acc: 0.9220
Epoch 20/25
24/24 [==============================] - 15s 609ms/step - loss: 0.3950 - acc: 0.8884 - val_loss: 0.3416 - val_acc: 0.9050
Epoch 21/25
24/24 [==============================] - 15s 609ms/step - loss: 0.3817 - acc: 0.8912 - val_loss: 0.2400 - val_acc: 0.9250
Epoch 22/25
24/24 [==============================] - 15s 606ms/step - loss: 0.3677 - acc: 0.8894 - val_loss: 0.2519 - val_acc: 0.9280
Epoch 23/25
24/24 [==============================] - 15s 616ms/step - loss: 0.3653 - acc: 0.8921 - val_loss: 0.2456 - val_acc: 0.9270
Epoch 24/25
24/24 [==============================] - 15s 616ms/step - loss: 0.3615 - acc: 0.8923 - val_loss: 0.2843 - val_acc: 0.9020
Epoch 25/25
24/24 [==============================] - 15s 615ms/step - loss: 0.3168 - acc: 0.9091 - val_loss: 0.2496 - val_acc: 0.9230
Out[35]:
<keras.callbacks.History at 0x7fb1c3177710>

Amazingly, using nothing but a small sample, a simple (not pre-trained) model with no dropout, and data augmentation, we're getting results that would get us into the top 50% of the competition! This looks like a great foundation for our further experiments.

To go further, we'll need to use the whole dataset: the right amount of dropout depends closely on the amount of data, so we can't tune dropout without training on all of it.
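
A minimal sketch of that switch - repoint path at the full dataset (which has the same directory layout as the sample) and rebuild the batches and step counts:

In [ ]:
path = "data/state/"
batches = get_batches(path+'train', gen_t, batch_size=batch_size)
val_batches = get_batches(path+'valid', batch_size=batch_size*2, shuffle=False)
steps_per_epoch = int(np.ceil(batches.samples/batch_size))
validation_steps = int(np.ceil(val_batches.samples/(batch_size*2)))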


In [ ]: