Ensemble Design Pattern

Stacking is an ensemble method that combines the outputs of a collection of models to make a prediction. The initial models, which are typically of different types, are trained to completion on the full training dataset. A secondary meta-model is then trained using the initial models' outputs as features. This meta-model, which can be any type of machine learning model, learns how best to combine the outputs of the initial models to reduce the training error.
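
The idea can be sketched without any deep learning framework. The following is a minimal illustration, not the notebook's method: the two base "models" (base_a, base_b), the toy data, and the single blend weight w are hypothetical stand-ins, and the meta-model is about the simplest one possible, a least-squares blend of the two base outputs.

```python
# Minimal stacking sketch in pure Python (illustrative assumptions throughout).

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy training data: the target is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Two hypothetical, already "trained" base models with different biases.
def base_a(x):
    return 1.8 * x          # tends to under-predict

def base_b(x):
    return 2.3 * x - 0.2    # tends to over-predict

pred_a = [base_a(x) for x in xs]
pred_b = [base_b(x) for x in xs]

# Meta-model: y_hat = w * pred_a + (1 - w) * pred_b.
# The least-squares optimum for w has a closed form.
num = sum((y - b) * (a - b) for y, a, b in zip(ys, pred_a, pred_b))
den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
w = num / den

stacked = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

print("base_a MSE :", mse(ys, pred_a))
print("base_b MSE :", mse(ys, pred_b))
print("stacked MSE:", mse(ys, stacked))
```

Because w = 1 and w = 0 reproduce the two base models exactly, the fitted blend cannot have a higher training error than either of them. Whether that advantage carries over to held-out data still has to be checked on a test set.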

Create a Stacking Ensemble model

In this notebook, we'll create an Ensemble of three neural network models and train on the natality dataset.

In [2]:
import os

import pandas as pd
import tensorflow as tf

from tensorflow import keras
from tensorflow import feature_column as fc
from tensorflow.keras import layers, models, Model

In [3]:
df = pd.read_csv("./data/babyweight_train.csv")
df.head()

   weight_pounds  is_male  mother_age  plurality  gestation_weeks  mother_race
0       7.749249    False          12  Single(1)               40          1.0
1       7.561856     True          12  Single(1)               40          2.0
2       7.187070    False          12  Single(1)               34          3.0
3       6.375769     True          12  Single(1)               36          2.0
4       7.936641    False          12  Single(1)               35          NaN

Create our tf.data input pipeline

In [4]:
# Determine CSV and label columns.
# Create list of string column headers; the order must match the CSV file.
CSV_COLUMNS = ["weight_pounds",
               "is_male",
               "mother_age",
               "plurality",
               "gestation_weeks",
               "mother_race"]

# Add string name for label column
LABEL_COLUMN = "weight_pounds"

# Set default values for each CSV column as a list of lists.
# Treat is_male and plurality as strings.
DEFAULTS = [[0.0], ["null"], [0.0], ["null"], [0.0], ["0"]]

In [5]:
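def get_dataset(file_path):
    # Arguments other than batch_size are reconstructed from the constants above.
    dataset = tf.data.experimental.make_csv_dataset(
        file_path,
        batch_size=15, # Artificially small to make examples easier to show.
        label_name=LABEL_COLUMN,
        select_columns=CSV_COLUMNS,
        column_defaults=DEFAULTS,
        num_epochs=1)  # single pass, so training and evaluation terminate
    return dataset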

train_data = get_dataset("./data/babyweight_train.csv")
test_data = get_dataset("./data/babyweight_eval.csv")

Check the contents of our tf.data dataset:

In [6]:
def show_batch(dataset):
    for batch, label in dataset.take(1):
        for key, value in batch.items():
            print("{:20s}: {}".format(key,value.numpy()))

show_batch(train_data)


is_male             : [b'True' b'True' b'True' b'False' b'True' b'True' b'False' b'True' b'True'
 b'False' b'True' b'True' b'True' b'True' b'False']
mother_age          : [16. 17. 17. 18. 16. 17. 17. 18. 17. 17. 17. 17. 16. 16. 16.]
plurality           : [b'Single(1)' b'Single(1)' b'Single(1)' b'Single(1)' b'Single(1)'
 b'Single(1)' b'Single(1)' b'Single(1)' b'Single(1)' b'Single(1)'
 b'Single(1)' b'Single(1)' b'Single(1)' b'Single(1)' b'Single(1)']
gestation_weeks     : [38. 38. 40. 39. 41. 39. 40. 42. 40. 33. 33. 40. 39. 38. 39.]
mother_race         : [b'2.0' b'2.0' b'2.0' b'2.0' b'2.0' b'0' b'1.0' b'1.0' b'0' b'2.0' b'2.0'
 b'1.0' b'1.0' b'2.0' b'2.0']

Create our feature columns

In [7]:
numeric_columns = [fc.numeric_column("mother_age"),
                   fc.numeric_column("gestation_weeks")]

CATEGORIES = {
    'plurality': ["Single(1)", "Twins(2)", "Triplets(3)",
                  "Quadruplets(4)", "Quintuplets(5)", "Multiple(2+)"],
    'is_male' : ["True", "False", "Unknown"],
    'mother_race': [str(_) for _ in df.mother_race.unique()]
}

categorical_columns = []
for feature, vocab in CATEGORIES.items():
    cat_col = fc.categorical_column_with_vocabulary_list(
        key=feature, vocabulary_list=vocab)
    # wrap in an indicator (one-hot) column so DenseFeatures can consume it
    categorical_columns.append(fc.indicator_column(cat_col))
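
To make the effect of these columns concrete, here is a rough pure-Python sketch of what a vocabulary-list categorical column produces for a single value once it is one-hot encoded for a dense layer. The helper one_hot is hypothetical and only mimics the behaviour; it assumes the default settings, under which a value outside the vocabulary encodes as all zeros.

```python
# Hypothetical helper that mimics a one-hot (indicator) encoding over a
# fixed vocabulary; this is an illustration, not the TensorFlow implementation.
def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if value == term else 0.0 for term in vocabulary]

is_male_vocab = ["True", "False", "Unknown"]

print(one_hot("False", is_male_vocab))    # [0.0, 1.0, 0.0]
print(one_hot("missing", is_male_vocab))  # [0.0, 0.0, 0.0]
```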

Create our ensemble models

We'll train three different neural network models.

In [8]:
inputs = {colname: tf.keras.layers.Input(
    name=colname, shape=(), dtype="float32")
    for colname in ["mother_age", "gestation_weeks"]}

inputs.update({colname: tf.keras.layers.Input(
    name=colname, shape=(), dtype="string")
    for colname in ["is_male", "plurality", "mother_race"]})

dnn_inputs = layers.DenseFeatures(categorical_columns+numeric_columns)(inputs)

# model_1
model1_h1 = layers.Dense(50, activation="relu")(dnn_inputs)
model1_h2 = layers.Dense(30, activation="relu")(model1_h1)
model1_output = layers.Dense(1, activation="relu")(model1_h2)
model_1 = tf.keras.models.Model(inputs=inputs, outputs=model1_output, name="model_1")

# model_2
model2_h1 = layers.Dense(64, activation="relu")(dnn_inputs)
model2_h2 = layers.Dense(32, activation="relu")(model2_h1)
model2_output = layers.Dense(1, activation="relu")(model2_h2)
model_2 = tf.keras.models.Model(inputs=inputs, outputs=model2_output, name="model_2")

# model_3
model3_h1 = layers.Dense(32, activation="relu")(dnn_inputs)
model3_output = layers.Dense(1, activation="relu")(model3_h1)
model_3 = tf.keras.models.Model(inputs=inputs, outputs=model3_output, name="model_3")

WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/feature_column/feature_column_v2.py:4267: IndicatorColumn._variable_shape (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/feature_column/feature_column_v2.py:4322: VocabularyListCategoricalColumn._num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.

The function below trains a model and reports the MSE and RMSE on the test set.

In [9]:
# fit model on dataset
def fit_model(model):
    # compile model (MSE loss, matching the loss and mse metric reported below)
    model.compile(loss='mse',
        optimizer='adam', metrics=['mse'])
    # fit model
    model.fit(train_data.shuffle(500), epochs=1)
    # evaluate model
    test_loss, test_mse = model.evaluate(test_data)
    print('\n\n{}:\nTest Loss {}, Test RMSE {}'.format(
        model.name, test_loss, test_mse**0.5))
    return model

In [10]:
# create directory for models
try:
    os.makedirs('models')
except FileExistsError:
    print("directory already exists")

directory already exists

Next, we'll train each neural network and save the trained model to file.

In [11]:
members = [model_1, model_2, model_3]

# fit and save models
n_members = len(members)

for i in range(n_members):
    # fit model
    model = fit_model(members[i])
    # save model
    filename = 'models/model_' + str(i + 1) + '.h5'
    model.save(filename, save_format='tf')
    print('Saved {}\n'.format(filename))

17638/17638 [==============================] - 127s 7ms/step - loss: 1.1232 - mse: 1.1232
   4343/Unknown - 24s 6ms/step - loss: 2.9008 - mse: 2.9011

Test Loss 2.9008474301768974, Test RMSE 1.7032530850184173
Saved models/model_1.h5

17638/17638 [==============================] - 117s 7ms/step - loss: 1.1097 - mse: 1.1097
   4343/Unknown - 23s 5ms/step - loss: 2.0815 - mse: 2.0817

Test Loss 2.081478821759177, Test RMSE 1.4428068143465294
Saved models/model_2.h5

17638/17638 [==============================] - 114s 6ms/step - loss: 1.1293 - mse: 1.1293
   4343/Unknown - 23s 5ms/step - loss: 2.4174 - mse: 2.4173

Test Loss 2.417384987838966, Test RMSE 1.554769235887298
Saved models/model_3.h5

The test RMSE differs across the three networks: model_2 has the lowest (about 1.44) and model_1 the highest (about 1.70).
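
The RMSE is simply the square root of the MSE, so the comparison can be reproduced from the logged metrics. The sketch below uses the test MSE values copied from the progress output above, rounded as logged; the dictionary name test_mse is only illustrative.

```python
import math

# Test MSE per model, as shown in the evaluation logs above.
test_mse = {"model_1": 2.9011, "model_2": 2.0817, "model_3": 2.4173}

# RMSE is the square root of MSE, expressed in the label's units (pounds).
test_rmse = {name: math.sqrt(value) for name, value in test_mse.items()}

best = min(test_rmse, key=test_rmse.get)
for name, value in test_rmse.items():
    print("{}: RMSE {:.4f}".format(name, value))
print("Lowest test RMSE:", best)
```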

Load the trained models and create the stacked ensemble model.

The function below loads the trained models and returns them in a list.

In [12]:
# load trained models from file
def load_models(n_models):
    all_models = []
    for i in range(n_models):
        filename = 'models/model_' + str(i + 1) + '.h5'
        # load model from file
        model = models.load_model(filename)
        # add to list of members
        all_models.append(model)
        print('>loaded %s' % filename)
    return all_models

In [14]:
# load all models
members = load_models(n_members)
print('Loaded %d models' % len(members))

>loaded models/model_1.h5
>loaded models/model_2.h5
>loaded models/model_3.h5
Loaded 3 models

We need to freeze the layers of the pre-trained models, since we won't train these models any further. Only the Stacked Ensemble's own layers will be trainable, and they will learn how best to combine the results of the ensemble members.

In [15]:
# update all layers in all models to not be trainable
for i in range(n_members):
    model = members[i]
    for layer in model.layers:
        # make not trainable
        layer.trainable = False
        # rename to avoid 'unique layer name' issue
        layer._name = 'ensemble_' + str(i+1) + '_' + layer.name

Now we'll create our Stacked Ensemble model, which is itself a neural network, using the Keras Functional API.

In [32]:
member_inputs = [model.input for model in members]

# concatenate the outputs of the member models
member_outputs = [model.output for model in members]
merge = layers.concatenate(member_outputs)
h1 = layers.Dense(30, activation='relu')(merge)
h2 = layers.Dense(20, activation='relu')(h1)
h3 = layers.Dense(10, activation='relu')(h2)
ensemble_output = layers.Dense(1, activation='relu')(h3)
ensemble_model = Model(inputs=member_inputs, outputs=ensemble_output)

# plot graph of ensemble
tf.keras.utils.plot_model(ensemble_model, show_shapes=True, to_file='ensemble_graph.png')

# compile
ensemble_model.compile(loss='mse', optimizer='adam', metrics=['mse'])

We need to adapt our tf.data pipeline to accommodate the multiple inputs for our Stacked Ensemble model.

In [33]:
FEATURES = ["is_male", "mother_age", "plurality",
            "gestation_weeks", "mother_race"]

# duplicate the input features once per ensemble member for our tf.data dataset
def stack_features(features, label):
    for feature in FEATURES:
        for i in range(n_members):
            features['ensemble_' + str(i+1) + '_' + feature] = features[feature]
    return features, label

ensemble_data = train_data.map(stack_features).repeat(1)
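
To see what this mapping does to a single batch, the following sketch applies the same renaming logic to a plain Python dictionary instead of a dictionary of tensors. The sample values, the label, and the names FEATURE_NAMES, N_MEMBERS and stack_features_dict are illustrative assumptions that mirror the names used above.

```python
# Plain-Python illustration of the feature duplication; lists stand in for tensors.
FEATURE_NAMES = ["is_male", "mother_age", "plurality",
                 "gestation_weeks", "mother_race"]
N_MEMBERS = 3  # assumed to match the number of ensemble members

def stack_features_dict(features, label):
    """Copy each feature under one prefixed key per ensemble member."""
    stacked = dict(features)  # keep the original keys, as the notebook version does
    for feature in FEATURE_NAMES:
        for i in range(N_MEMBERS):
            stacked["ensemble_{}_{}".format(i + 1, feature)] = features[feature]
    return stacked, label

# Hypothetical one-row batch and label.
batch = {"is_male": ["True"], "mother_age": [16.0], "plurality": ["Single(1)"],
         "gestation_weeks": [38.0], "mother_race": ["2.0"]}
stacked, label = stack_features_dict(batch, [7.5])

print(len(stacked))                      # 20 keys: 5 original + 3 * 5 prefixed
print(stacked["ensemble_2_mother_age"])  # [16.0]
```

The prefixed keys line up with the renamed layers (ensemble_1_..., ensemble_2_..., ensemble_3_...), which is presumably how each copy of a feature is matched to the input of the corresponding member model.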

In [34]:
ensemble_model.fit(ensemble_data.shuffle(500), epochs=1)

17638/17638 [==============================] - 165s 9ms/step - loss: 1.2281 - mse: 1.2281
<tensorflow.python.keras.callbacks.History at 0x7f89d832e710>

Lastly, we will evaluate our Stacked Ensemble against the test set.

In [35]:
val_loss, val_mse = ensemble_model.evaluate(test_data.map(stack_features))

   4343/Unknown - 35s 8ms/step - loss: 2.0102 - mse: 2.0102

In [36]:
print("Validation RMSE: {}".format(val_mse**0.5))

Validation RMSE: 1.4178322129942251

