An LSTM for IMDB Review Classification


In [1]:
import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
numpy.random.seed(7)  # fix the random seed for reproducibility


Using TensorFlow backend.

Load the dataset, keeping only the top n most frequent words; less frequent words are mapped to a single out-of-vocabulary index


In [2]:
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
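
As a sanity check, it can help to decode one encoded review back into words. A minimal sketch, assuming Keras's default offsets for the IMDB loader (index_from=3, with indices 0-2 reserved for padding, start, and out-of-vocabulary markers):


In [ ]:
# Decode the first training review (offsets are Keras's documented defaults)
word_index = imdb.get_word_index()
index_to_word = {i + 3: w for w, i in word_index.items()}
index_to_word.update({0: '<pad>', 1: '<start>', 2: '<oov>'})
print(' '.join(index_to_word.get(i, '?') for i in X_train[0]))
print('label:', y_train[0])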

Truncate and pad the input sequences so that every review is exactly max_review_length tokens long


In [3]:
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
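
By default pad_sequences pads and truncates at the front of each sequence. A quick sketch of the behaviour on toy input:


In [ ]:
# pad_sequences defaults to padding='pre' and truncating='pre'
print(sequence.pad_sequences([[1, 2, 3], [4, 5, 6, 7, 8]], maxlen=4))
# [[0 1 2 3]
#  [5 6 7 8]]
print(X_train.shape)  # (25000, 500)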

Start by building an LSTM network with three layers...

  1. an embedding layer representing each word as a 32-dimensional vector
  2. an LSTM layer with 100 memory units
  3. a dense output layer with a single sigmoid neuron

...and develop it into the final network below via the following steps (a sketch of the baseline and the dropout variants follows this list):

  • 3-layer LSTM: 87.7% after two epochs
  • ditto, but with dropout layers: 86.8% peak after two epochs
  • ditto, but with the LSTM layer's own recurrent_dropout: 84.8% peak after three epochs
  • final network with a convolutional layer outperforms them all: ~88% peak after two epochs
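
For reference, a minimal sketch of the starting three-layer network and the two dropout variants from the list above; the 0.2 rates are illustrative assumptions, since the rates used for those figures are not recorded here:


In [ ]:
# Baseline: Embedding -> LSTM -> Dense, as described above
baseline = Sequential()
baseline.add(Embedding(top_words, 32, input_length=max_review_length))
baseline.add(LSTM(100))
baseline.add(Dense(1, activation='sigmoid'))

# Variant with dropout layers around the LSTM (0.2 is an assumed rate)
from keras.layers import Dropout
with_dropout = Sequential()
with_dropout.add(Embedding(top_words, 32, input_length=max_review_length))
with_dropout.add(Dropout(0.2))
with_dropout.add(LSTM(100))
with_dropout.add(Dropout(0.2))
with_dropout.add(Dense(1, activation='sigmoid'))

# Variant using the LSTM layer's own dropout arguments instead
with_recurrent = Sequential()
with_recurrent.add(Embedding(top_words, 32, input_length=max_review_length))
with_recurrent.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
with_recurrent.add(Dense(1, activation='sigmoid'))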

In [4]:
embedding_vector_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vector_length, input_length=max_review_length))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))  # local n-gram features
model.add(MaxPooling1D(pool_size=2))  # halve the sequence length to 250
model.add(LSTM(100))  # 100 memory units over the pooled feature sequence
model.add(Dense(1, activation='sigmoid'))  # binary sentiment probability
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=64)


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 500, 32)           160000    
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 500, 32)           3104      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 250, 32)           0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 100)               53200     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 101       
=================================================================
Total params: 216,405
Trainable params: 216,405
Non-trainable params: 0
_________________________________________________________________
None
Train on 25000 samples, validate on 25000 samples
Epoch 1/10
25000/25000 [==============================] - 169s - loss: 0.4265 - acc: 0.7901 - val_loss: 0.3143 - val_acc: 0.8648
Epoch 2/10
25000/25000 [==============================] - 166s - loss: 0.2553 - acc: 0.9006 - val_loss: 0.2961 - val_acc: 0.8807
Epoch 3/10
25000/25000 [==============================] - 172s - loss: 0.2133 - acc: 0.9178 - val_loss: 0.2907 - val_acc: 0.8796
Epoch 4/10
25000/25000 [==============================] - 170s - loss: 0.1833 - acc: 0.9314 - val_loss: 0.3054 - val_acc: 0.8802
Epoch 5/10
25000/25000 [==============================] - 167s - loss: 0.1554 - acc: 0.9434 - val_loss: 0.3512 - val_acc: 0.8770
Epoch 6/10
25000/25000 [==============================] - 168s - loss: 0.1347 - acc: 0.9520 - val_loss: 0.3603 - val_acc: 0.8720
Epoch 7/10
25000/25000 [==============================] - 167s - loss: 0.1195 - acc: 0.9599 - val_loss: 0.3999 - val_acc: 0.8725
Epoch 8/10
25000/25000 [==============================] - 169s - loss: 0.1018 - acc: 0.9658 - val_loss: 0.4413 - val_acc: 0.8682
Epoch 9/10
25000/25000 [==============================] - 167s - loss: 0.0873 - acc: 0.9722 - val_loss: 0.4350 - val_acc: 0.8698
Epoch 10/10
25000/25000 [==============================] - 168s - loss: 0.0753 - acc: 0.9766 - val_loss: 0.4781 - val_acc: 0.8715
Out[4]:
<keras.callbacks.History at 0x7efcc02e3a90>
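
The parameter counts in the summary can be verified by hand; the LSTM's four gates each carry input, recurrent, and bias weights:


In [ ]:
# Verify the parameter counts reported by model.summary()
emb   = 5000 * 32                     # 160000: one 32-dim vector per vocabulary word
conv  = 3 * 32 * 32 + 32              # 3104: kernel_size x in_channels x filters + biases
lstm  = 4 * ((32 + 100) * 100 + 100)  # 53200: 4 gates x (input + recurrent + bias)
dense = 100 * 1 + 1                   # 101: one weight per LSTM unit, plus bias
print(emb + conv + lstm + dense)      # 216405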

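Validation accuracy peaks around 88% at epochs 2-3 and then decays as the network overfits, so a sensible follow-up is to retrain for two or three epochs and score the held-out set. A sketch of the final evaluation:


In [ ]:
# Score the trained model on the test set
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))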