Recurrent neural networks (RNNs) in Keras


In [1]:
import numpy as np

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM, Bidirectional
from keras.datasets import imdb
from keras.utils import plot_model


Using TensorFlow backend.

Loading the IMDB sentiment classification dataset

The dataset used here is the IMDB movie review sentiment classification dataset, available through Keras:

  • 25,000 movie reviews from IMDB for training and 25,000 for testing
  • binary labels by sentiment (1 = positive, 0 = negative)
  • each review encoded as a sequence of integers representing word indices
  • words indexed by overall frequency, so a low index corresponds to a frequent word (indices 0-2 are reserved for padding, start-of-sequence, and out-of-vocabulary markers)
  • imdb.load_data() accepts a num_words argument to keep only the most frequent words

In [2]:
max_features = 20000
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

In [3]:
x_train[0][:5]


Out[3]:
[1, 14, 22, 16, 43]
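
The indices can be mapped back to words with imdb.get_word_index() for inspection. A small illustrative sketch (it assumes the load_data() defaults, under which word indices are shifted by index_from=3 and 0/1/2 mark padding, start-of-sequence, and out-of-vocabulary; the token strings below are arbitrary placeholders):

word_index = imdb.get_word_index()
index_word = {i + 3: w for w, i in word_index.items()}  # undo the default index_from=3 shift
index_word.update({0: '<pad>', 1: '<start>', 2: '<oov>'})
print(' '.join(index_word.get(i, '<oov>') for i in x_train[0][:10]))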

In [4]:
y_train[:5]


Out[4]:
array([1, 0, 0, 1, 0])
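
A quick sanity check on the labels (the IMDB split is balanced, with 12,500 reviews per class):

print(np.bincount(y_train))  # expect [12500 12500]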

In [5]:
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Average train sequence length: {}'.format(np.mean(list(map(len, x_train)), dtype=int)))
print('Average test sequence length: {}'.format(np.mean(list(map(len, x_test)), dtype=int)))


25000 train sequences
25000 test sequences
Average train sequence length: 238
Average test sequence length: 230

In [6]:
maxlen = 300
# pad sequences shorter than maxlen with 0s and truncate
# longer sequences to maxlen; both happen at the start of
# each sequence by default (padding='pre', truncating='pre')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)


x_train shape: (25000, 300)
x_test shape: (25000, 300)
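
Because pad_sequences pads and truncates at the start of each sequence by default, it is the end of a long review that survives truncation. A toy illustration:

print(sequence.pad_sequences([[1, 2], [1, 2, 3, 4, 5]], maxlen=4))
# [[0 0 1 2]
#  [2 3 4 5]]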

Plain LSTM


In [7]:
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(LSTM(64))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

# try using different optimizers and different optimizer configs
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
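
model.summary() reports the layer sizes and parameter counts; they can also be checked by hand. An LSTM with input dimension d and u units carries 4(d + u + 1)u weights (four gates, each with input weights, recurrent weights, and a bias):

model.summary()
# Embedding: 20000 * 128             = 2,560,000
# LSTM:      4 * (128 + 64 + 1) * 64 =    49,408
# Dense:     64 + 1                  =        65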

In [8]:
plot_model(model, to_file='img/chapter_10_rnn_no_bidir.png', show_shapes=True)


In [9]:
batch_size = 32
epochs = 10

In [10]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))


Train on 25000 samples, validate on 25000 samples
Epoch 1/10
25000/25000 [==============================] - 699s - loss: 0.4058 - acc: 0.8138 - val_loss: 0.4562 - val_acc: 0.7861
Epoch 2/10
25000/25000 [==============================] - 729s - loss: 0.2252 - acc: 0.9159 - val_loss: 0.3223 - val_acc: 0.8666
Epoch 3/10
25000/25000 [==============================] - 630s - loss: 0.1582 - acc: 0.9431 - val_loss: 0.3676 - val_acc: 0.8656
Epoch 4/10
25000/25000 [==============================] - 1121s - loss: 0.1123 - acc: 0.9609 - val_loss: 0.4183 - val_acc: 0.8652
Epoch 5/10
25000/25000 [==============================] - 635s - loss: 0.0799 - acc: 0.9733 - val_loss: 0.5016 - val_acc: 0.8648
Epoch 6/10
25000/25000 [==============================] - 659s - loss: 0.0569 - acc: 0.9819 - val_loss: 0.5631 - val_acc: 0.8512
Epoch 7/10
25000/25000 [==============================] - 668s - loss: 0.0465 - acc: 0.9853 - val_loss: 0.5879 - val_acc: 0.8596
Epoch 8/10
25000/25000 [==============================] - 693s - loss: 0.0321 - acc: 0.9908 - val_loss: 0.6612 - val_acc: 0.8552
Epoch 9/10
25000/25000 [==============================] - 687s - loss: 0.0404 - acc: 0.9873 - val_loss: 0.6174 - val_acc: 0.8165
Epoch 10/10
25000/25000 [==============================] - 660s - loss: 0.0476 - acc: 0.9856 - val_loss: 0.6479 - val_acc: 0.8465
Out[10]:
<keras.callbacks.History at 0x12b213cf8>
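
Validation loss bottoms out around epoch 2 and rises afterwards while training accuracy approaches 1.0, a clear sign of overfitting. One standard remedy is early stopping; a minimal sketch with Keras' built-in callback (the patience value here is an arbitrary choice, and stopping is driven by a held-out slice of the training data rather than the test set):

from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=2)
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2,
          callbacks=[early_stop])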

Bi-directional LSTM


In [11]:
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(Bidirectional(LSTM(64)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

# try using different optimizers and different optimizer configs
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
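
Bidirectional wraps two copies of the LSTM, one reading the sequence forward and one backward, and by default concatenates their outputs (merge_mode='concat'). The layer output is therefore 2 * 64 = 128-dimensional, and the recurrent parameter count doubles to 2 * 49,408 = 98,816 (the final Dense layer grows to 129 weights accordingly).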

In [12]:
plot_model(model, to_file='img/chapter_10_rnn.png', show_shapes=True)


In [13]:
batch_size = 32
epochs = 10

In [14]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))


Train on 25000 samples, validate on 25000 samples
Epoch 1/10
25000/25000 [==============================] - 826s - loss: 0.4134 - acc: 0.8109 - val_loss: 0.3387 - val_acc: 0.8601
Epoch 2/10
25000/25000 [==============================] - 860s - loss: 0.2529 - acc: 0.9058 - val_loss: 0.3544 - val_acc: 0.8508
Epoch 3/10
25000/25000 [==============================] - 874s - loss: 0.1719 - acc: 0.9380 - val_loss: 0.3426 - val_acc: 0.8666
Epoch 4/10
25000/25000 [==============================] - 839s - loss: 0.1165 - acc: 0.9608 - val_loss: 0.3985 - val_acc: 0.8422
Epoch 5/10
25000/25000 [==============================] - 903s - loss: 0.0938 - acc: 0.9682 - val_loss: 0.4626 - val_acc: 0.8665
Epoch 6/10
25000/25000 [==============================] - 854s - loss: 0.0574 - acc: 0.9811 - val_loss: 0.4859 - val_acc: 0.8617
Epoch 7/10
25000/25000 [==============================] - 825s - loss: 0.0578 - acc: 0.9815 - val_loss: 0.5586 - val_acc: 0.8501
Epoch 8/10
25000/25000 [==============================] - 813s - loss: 0.0429 - acc: 0.9858 - val_loss: 0.6032 - val_acc: 0.8555
Epoch 9/10
25000/25000 [==============================] - 815s - loss: 0.0275 - acc: 0.9918 - val_loss: 0.6182 - val_acc: 0.8561
Epoch 10/10
25000/25000 [==============================] - 924s - loss: 0.0180 - acc: 0.9949 - val_loss: 0.7565 - val_acc: 0.7954
Out[14]:
<keras.callbacks.History at 0x1345dfb00>
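
On this run, both variants peak around 86-87% validation accuracy within the first few epochs before overfitting sets in. With a trained model, held-out performance and per-review scores can be read off directly; a minimal sketch using standard Keras calls:

score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test loss: {:.4f}, test accuracy: {:.4f}'.format(score, acc))

# sigmoid output: probability that each review is positive
print(model.predict(x_test[:5]))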