Following on from Guide to the Sequential Model

10 May 2017 - WH Nixalo

Getting started with the Keras Sequential model

The Sequential model is a linear stack of layers.

You can create a Sequential model by passing a list of layer instances to the constructor:



In [ ]:

    
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential([Dense(32, input_shape=(784,)),
                    Activation('relu'),
                    Dense(10),
                    Activation('softmax'),])

You can also simply add layers via the .add() method:



In [ ]:

    
model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
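
To make "linear stack of layers" concrete, here is a minimal pure-Python sketch. TinySequential, double and increment are hypothetical names made up for illustration; this is not how Keras implements Sequential, only the idea that each layer's output feeds the next one, whether the layers come from a list or from .add():

```python
class TinySequential:
    """Toy stand-in for a linear stack of layers (not the Keras class)."""

    def __init__(self, layers=None):
        # Accept an optional list of layers, mirroring the constructor form.
        self.layers = list(layers) if layers else []

    def add(self, layer):
        # Append one layer to the end of the stack, mirroring .add().
        self.layers.append(layer)

    def __call__(self, x):
        # Feed the output of each layer into the next one, in order.
        for layer in self.layers:
            x = layer(x)
        return x


def double(x):
    return 2 * x


def increment(x):
    return x + 1


# Build the same stack in the two ways described above.
from_list = TinySequential([double, increment])
from_add = TinySequential()
from_add.add(double)
from_add.add(increment)

print(from_list(3), from_add(3))  # both compute increment(double(3)) = 7
```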

Specifying the input shape

The first layer in a Sequential model needs to know its input shape. Only the first layer needs this, because the following layers infer their shapes automatically. There are several ways to specify it:

Pass an input_shape argument to the first layer. This is a tuple of ints or None entries, where None indicates that any positive integer may be expected.
2D layers, like Dense (a.k.a. fully connected/linear), can take their input shape via the input_dim argument; some 3D temporal layers take input_dim and input_length.
To specify a fixed batch size for inputs (useful for stateful recurrent networks), pass a batch_size argument. If you pass both batch_size=32 and input_shape=(6, 8), the layer will expect every batch of inputs to have batch shape (32, 6, 8).
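
The batch-shape rule is just tuple concatenation. A small sketch, assuming a hypothetical helper full_batch_shape (not a Keras function) that prepends the batch dimension to input_shape:

```python
def full_batch_shape(input_shape, batch_size=None):
    """Return the shape of one batch: (batch_size,) + input_shape.

    batch_size=None stands for "any batch size", the same way None
    inside input_shape stands for "any positive integer".
    """
    return (batch_size,) + tuple(input_shape)


fixed = full_batch_shape((6, 8), batch_size=32)
flexible = full_batch_shape((784,))

print(fixed)     # (32, 6, 8)
print(flexible)  # (None, 784)
```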

The two snippets below are strictly equivalent:



In [ ]:

    
model = Sequential()
model.add(Dense(32, input_shape=(784,)))



In [ ]:

    
model = Sequential()
model.add(Dense(32, input_dim=784))
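
Why only the first layer needs a shape: once the input width is known, each Dense layer's output width is its number of units, which becomes the next layer's input width. A rough sketch of that bookkeeping, using a hypothetical helper infer_dense_shapes (not part of Keras) and assuming every layer is a Dense layer with a bias:

```python
def infer_dense_shapes(input_shape, units_per_layer):
    """Return (output_shape, param_count) for each Dense layer, in order."""
    results = []
    leading_dims = tuple(input_shape[:-1])
    in_features = input_shape[-1]
    for units in units_per_layer:
        # Weight matrix of size in_features x units, plus one bias per unit.
        params = in_features * units + units
        results.append((leading_dims + (units,), params))
        # The next layer's input width is this layer's output width.
        in_features = units
    return results


# input_dim=784 is treated as shorthand for input_shape=(784,)
input_dim = 784
via_input_shape = infer_dense_shapes((784,), [32, 10])
via_input_dim = infer_dense_shapes((input_dim,), [32, 10])

print(via_input_shape)  # [((32,), 25120), ((10,), 330)]
```

Both calls give the same result, which is the sense in which the two ways of declaring the first layer are equivalent.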

Compilation

Before training a model, the learning process must be configured via the compile method. It takes three arguments:

Optimizer. Either the string identifier of an existing optimizer (rmsprop, adagrad, etc.) or an instance of the Optimizer class. See: Optimizers
Loss function. Either the string identifier of an existing loss function (categorical_crossentropy, mse, etc.) or an objective function. See: Losses
List of metrics. For any classification problem you'll want to set this to metrics=['accuracy']. A metric can be the string identifier of an existing metric or a custom metric function.



In [ ]:

    
# For a multi-class classification problem
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# For binary classification
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# For mean squared error regression
model.compile(optimizer='rmsprop',
              loss='mse')

# For custom metrics
import keras.backend as K

def mean_pred(y_true, y_pred):
    return K.mean(y_pred)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])
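
To see what these loss and metric names compute, here is a NumPy sketch on a two-sample toy batch. The functions are written from the textbook definitions and are only an approximation of what the backend does (the eps clipping value is an assumption); the toy arrays are made up for illustration:

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # One loss value per sample: minus the log-probability of the true class.
    return -np.sum(y_true * np.log(y_pred), axis=-1)


def categorical_accuracy(y_true, y_pred):
    # Fraction of samples whose highest-scoring class is the true class.
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))


def mean_pred(y_true, y_pred):
    # Same idea as the custom metric above: the average predicted value.
    return np.mean(y_pred)


y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.3, 0.6, 0.1]])

loss = categorical_crossentropy(y_true, y_pred).mean()
acc = categorical_accuracy(y_true, y_pred)
avg = mean_pred(y_true, y_pred)
print(loss, acc, avg)
```

The first sample is classified correctly and the second is not, so the accuracy on this toy batch is 0.5.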

Training

Keras models are trained on NumPy arrays of input data and labels. You'll usually use the fit function to train a model. See: Fit documentation



In [ ]:

    
# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
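
What "batches of 32 samples" means for 1000 samples, as a pure-Python sketch. batch_slices is a hypothetical helper, and the sketch walks the data in order, whereas fit can also shuffle the samples each epoch:

```python
def batch_slices(num_samples, batch_size):
    """Yield (start, stop) index pairs that cover the data in order."""
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)


slices = list(batch_slices(1000, 32))
steps_per_epoch = len(slices)                 # one weight update per batch
last_batch_size = slices[-1][1] - slices[-1][0]
total_updates = steps_per_epoch * 10          # epochs=10

print(steps_per_epoch, last_batch_size, total_updates)  # 32 8 320
```

Since 1000 is not a multiple of 32, the final batch of each epoch holds only the 8 leftover samples.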



In [ ]:

    
# For a single-input model with 10 classes (categorical classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
import numpy as np
import keras  # needed for keras.utils below
data = np.random.random((1000, 100))
labels = np.random.randint(10, size=(1000, 1))

# Convert labels to categorical one-hot encoding
one_hot_labels = keras.utils.to_categorical(labels, num_classes=10)

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, one_hot_labels, epochs=10, batch_size=32)
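
For intuition about the one-hot conversion, here is a NumPy-only sketch of the same idea. one_hot is a hypothetical helper written for illustration, not the Keras function itself:

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label index."""
    labels = np.asarray(labels, dtype=int).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Row i gets a 1 in column labels[i]; every other entry stays 0.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


labels = np.array([[2], [0], [1]])
encoded = one_hot(labels, num_classes=3)
print(encoded)
# [[0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]]
```

Taking np.argmax along the last axis recovers the original integer labels.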

Keras Examples:

Github Folder

Multilayer Perceptron (MLP) for multi-class softmax classification:



In [ ]:

    
import keras  # needed for keras.utils below
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD

# Generate dummy data
import numpy as np
x_train = np.random.random((1000, 20))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
x_test = np.random.random((100, 20))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)

model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden nodes.
# In the first layer, specify the expected input data shape;
# here: 20-dimensional vectors.
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
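
A note on the decay=1e-6 argument to SGD. The sketch below assumes time-based decay of the form lr / (1 + decay * iterations), where iterations counts batch updates. That matches my understanding of the Keras SGD optimizer of this era, but treat the formula as an assumption and check the optimizer source; decayed_lr is a hypothetical helper:

```python
def decayed_lr(initial_lr, decay, iterations):
    """Assumed time-based decay: the rate shrinks as 1 / (1 + decay * iterations)."""
    return initial_lr / (1.0 + decay * iterations)


# 1000 training samples in batches of 128 -> 8 updates per epoch
# (the last batch of each epoch is smaller).
updates_per_epoch = -(-1000 // 128)  # ceiling division
total_updates = updates_per_epoch * 20  # epochs=20

lr_start = decayed_lr(0.01, 1e-6, 0)
lr_end = decayed_lr(0.01, 1e-6, total_updates)
print(total_updates, lr_start, lr_end)
```

Under this assumption the learning rate barely moves over such a short run, so with these settings the decay term mostly matters for much longer training.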

MLP for binary classification:



In [ ]:

    
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, 20))
y_test = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=20,
          batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
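
With metrics=['accuracy'], score should come back as a list holding the loss followed by the accuracy. To show what those two numbers mean for a sigmoid output, here is a pure-Python sketch from the textbook definitions. The logits and targets are made-up toy values, and the 0.5 threshold and eps clipping are assumptions:

```python
import math


def sigmoid(z):
    # Squash a raw score into a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice.
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def binary_accuracy(y_true, probs, threshold=0.5):
    # A prediction counts as class 1 when its probability exceeds the threshold.
    hits = [int(p > threshold) == y for y, p in zip(y_true, probs)]
    return sum(hits) / len(hits)


logits = [2.0, -1.0, 0.5, -3.0]
targets = [1, 0, 0, 0]

probs = [sigmoid(z) for z in logits]
mean_loss = sum(binary_crossentropy(y, p) for y, p in zip(targets, probs)) / len(targets)
accuracy = binary_accuracy(targets, probs)
print(mean_loss, accuracy)
```

Three of the four toy samples land on the right side of the threshold, so the accuracy here is 0.75.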

VGG-like ConvNet:



In [ ]:

    
# <...>

Sequence classification with LSTM:



In [ ]:

    
# <...>

Sequence classification with 1D convolutions:



In [ ]:

    
# <...>

Stacked LSTM for sequence classification:



In [ ]:

    
# <...>

Stacked LSTM model, rendered "stateful":



In [ ]:

    
# <...>


