This post examines "dropout", a quick improvement to the previously-constructed Abalone neural networks.

A problem endemic to practically all machine learning algorithms is overfitting: an algorithm may learn the training data very well, yet perform poorly when exposed to new data it has never seen before ("testing" data). "Regularization" techniques have been developed to prevent an algorithm from learning the training data too well, thereby improving accuracy on the testing data, which is the actual goal.
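
One common way to observe overfitting directly is to hold out part of the training data and watch the two loss curves diverge. As a sketch (assuming a compiled Keras model such as the abalone_model built below):

# Sketch: detecting overfitting with a held-out validation split.
history = abalone_model.fit(x_train, y_train, epochs=50, verbose=0,
                            validation_split=0.2)  # hold out 20% for validation
# Training loss that keeps falling while validation loss stalls or rises
# is the classic overfitting signature.
print(history.history['loss'][-1], history.history['val_loss'][-1])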

"Dropout" is a regularization concept in neural networks that prevents overfitting. By randomly "dropping out" a percentage of nodes within the network, the ability of the network to learn "noise" is limited, which allows the network to focus on the actual relationships in the data ("signal").

As before, the abalone data is preprocessed using the following code:


In [1]:
# Data preprocessing from Part 1
import datetime
import pandas as pd
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Dropout
abalone_df = pd.read_csv('abalone.csv',names=['Sex','Length','Diameter','Height',
    'Whole Weight','Shucked Weight', 'Viscera Weight','Shell Weight', 'Rings'])
abalone_df['Male'] = (abalone_df['Sex']=='M').astype(int)
abalone_df['Female'] = (abalone_df['Sex']=='F').astype(int)
abalone_df['Infant'] = (abalone_df['Sex']=='I').astype(int)
abalone_df = abalone_df[abalone_df['Height']>0]
train, test = train_test_split(abalone_df, train_size=0.7)
x_train = train.drop(['Rings','Sex'], axis=1).values
y_train = pd.DataFrame(train['Rings']).values
x_test = test.drop(['Rings','Sex'], axis=1).values
y_test = pd.DataFrame(test['Rings']).values
# Constructing a list of models to test
hlayers = [[x,y] for x in range(5,31,5) for y in range(5,31,5)]
hlayers.extend([[1,10],[10,1],[2,2]])


Using TensorFlow backend.

The following code iterates over the same models as before, but with dropout rates of 0.1, 0.2, 0.5, and 0.75:


In [2]:
# Iterate over the list of models, trying different dropout
begin = datetime.datetime.now()
results_dict = {}
for drop in [0.1, 0.2, 0.5, 0.75]:
    for layers in hlayers:
        abalone_model = Sequential([
            Dense(layers[0], input_dim=10),
            Dropout(drop),
            Dense(layers[1], activation='tanh'),
            Dropout(drop),
            Dense(1)])
        abalone_model.compile(optimizer='rmsprop', loss='mse',
                              metrics=['mean_absolute_error'])
        results = abalone_model.fit(x_train, y_train, epochs=50, verbose=0)
        score = abalone_model.evaluate(x_test, y_test)
        result_string = "[{},{}] drop={}".format(layers[0], layers[1], drop)
        results_dict[result_string] = score[1]
# Save the results in a DataFrame
results_df = pd.DataFrame.from_dict(results_dict, orient="index")
results_df.rename(columns={0 : "MAE"}, inplace=True)
seconds = (datetime.datetime.now() - begin).total_seconds()
sec_string = "Total elapsed seconds: {}".format(seconds)
print(sec_string)


1253/1253 [==============================] - 0s
... (evaluation progress bars repeated for each of the models) ...
Total elapsed seconds: 1977.361075

Notice that the elapsed time was higher than before: dropout operations can be somewhat computationally expensive. The results are shown below:


In [3]:
# Print the results matrix
results_df.sort_values('MAE').head(15)


Out[3]:
                       MAE
[30,10] drop=0.1  1.499875
[25,20] drop=0.1  1.502064
[15,30] drop=0.1  1.508090
[20,25] drop=0.2  1.513513
[20,10] drop=0.2  1.514683
[15,10] drop=0.1  1.514723
[20,20] drop=0.2  1.515138
[15,15] drop=0.2  1.515530
[10,15] drop=0.1  1.516270
[20,30] drop=0.2  1.517464
[25,10] drop=0.2  1.520284
[30,10] drop=0.5  1.521724
[30,25] drop=0.5  1.522384
[30,25] drop=0.2  1.522741
[20,15] drop=0.2  1.523161

The best results from Part 3 of the Abalone series were around 1.46, so these models actually performed slightly worse than before. Notice also that the best results came from the lowest dropout rates (0.1 and 0.2).
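
To confirm that pattern, a quick sketch (using the results_df built above) parses the dropout rate back out of each label and averages the MAE per rate:

# Sketch: mean test MAE per dropout rate, parsed from the index labels.
results_df['drop'] = (results_df.index.str.extract(r'drop=([\d.]+)', expand=False)
                      .astype(float))
print(results_df.groupby('drop')['MAE'].mean())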