LAB 4c: Create Keras Wide and Deep model.

Learning Objectives

  1. Set CSV Columns, label column, and column defaults
  2. Make dataset of features and label from CSV files
  3. Create input layers for raw features
  4. Create feature columns for inputs
  5. Create wide layer, deep dense hidden layers, and output layer
  6. Create custom evaluation metric
  7. Build wide and deep model tying all of the pieces together
  8. Train and evaluate

Introduction

In this notebook, we'll be using Keras to create a wide and deep model to predict the weight of a baby before it is born.

We'll start by defining the CSV column names, label column, and column defaults for our data inputs. Then, we'll construct a tf.data Dataset of features and the label from the CSV files and create input layers for the raw features. Next, we'll set up feature columns for the model inputs and build a wide and deep neural network in Keras. We'll create a custom evaluation metric and build our wide and deep model. Finally, we'll train and evaluate our model.

Each learning objective will correspond to a #TODO in this student lab notebook -- try to complete this notebook first and then review the solution notebook.

Load necessary libraries


In [1]:
import datetime
import os
import shutil
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
print(tf.__version__)


2.0.0

Verify CSV files exist

In the seventh lab of this series, 4a_sample_babyweight, we sampled our train, eval, and test CSV files from BigQuery. Verify that they exist; otherwise, go back to that lab and create them.


In [2]:
%%bash
ls *.csv


eval.csv
test.csv
train.csv

In [1]:
%%bash
head -5 *.csv


==> eval.csv <==
6.87621795178,False,33,Single(1),40
7.7492485093,Unknown,21,Single(1),38
8.86699217764,False,22,Single(1),38
6.60504936952,False,32,Single(1),40
8.313631900019999,True,36,Single(1),39

==> test.csv <==
7.5618555866,True,40,Twins(2),43
9.3586230219,Unknown,22,Single(1),40
8.5539357656,True,26,Single(1),37
5.81138522632,Unknown,36,Multiple(2+),36
7.06140625186,Unknown,23,Single(1),40

==> train.csv <==
10.18756112702,Unknown,23,Single(1),33
8.93754010148,True,40,Single(1),41
6.9996768185,Unknown,23,Single(1),38
8.65975765136,Unknown,19,Single(1),42
4.2549216566,True,20,Single(1),33

Create Keras model

Set CSV Columns, label column, and column defaults.

Now that we have verified that our CSV files exist, we need to set a few things that we will be using in our input function.

  • CSV_COLUMNS is the list of header names for our columns. Make sure they are in the same order as the columns in the CSV files.
  • LABEL_COLUMN is the header name of the column that is our label. We need to know this so we can pop it off of our features dictionary.
  • DEFAULTS is a list with the same length as CSV_COLUMNS, i.e. there is a default for each column in our CSVs. Each element is itself a list containing the default value for that CSV column.

In [6]:
# Determine CSV, label, and key columns
CSV_COLUMNS = ["weight_pounds",
               "is_male",
               "mother_age",
               "plurality",
               "gestation_weeks"]
LABEL_COLUMN = "weight_pounds"

# Set default values for each CSV column.
# Treat is_male and plurality as strings.
DEFAULTS = [[0.0], ["null"], [0.0], ["null"], [0.0]]
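
To see how these defaults behave, we can run a quick sanity check. This is a sketch with made-up example rows: tf.io.decode_csv applies DEFAULTS to fill in missing fields and to fix each column's dtype.


In [ ]:
# Hypothetical rows, not taken from our actual CSV files; the second row is
# missing its is_male field, so the "null" default fills it in.
for row in ["7.25,True,26,Single(1),39", "7.25,,26,Single(1),39"]:
    print(tf.io.decode_csv(records=row, record_defaults=DEFAULTS))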

Make dataset of features and label from CSV files.

Next, we will write an input function to read the data. Since we are reading from CSV files, we don't need to reinvent the wheel: tf.data.experimental.make_csv_dataset will create a CSV dataset object for us. However, we will need to divide the columns up into features and a label. We can do this by applying the map method to our dataset and popping our label column off of our dictionary of feature tensors.


In [7]:
def features_and_labels(row_data):
    """Splits features and labels from feature dictionary.

    Args:
        row_data: Dictionary of CSV column names and tensor values.
    Returns:
        Dictionary of feature tensors and label tensor.
    """
    label = row_data.pop(LABEL_COLUMN)

    return row_data, label  # features, label


def load_dataset(pattern, batch_size=1, mode=tf.estimator.ModeKeys.EVAL):
    """Loads dataset using the tf.data API from CSV files.

    Args:
        pattern: str, file pattern to glob into list of files.
        batch_size: int, the number of examples per batch.
        mode: tf.estimator.ModeKeys to determine if training or evaluating.
    Returns:
        `Dataset` object.
    """
    # Make a CSV dataset
    dataset = tf.data.experimental.make_csv_dataset(
        file_pattern=pattern,
        batch_size=batch_size,
        column_names=CSV_COLUMNS,
        column_defaults=DEFAULTS)

    # Map dataset to features and label
    dataset = dataset.map(map_func=features_and_labels)  # features, label

    # Shuffle and repeat for training
    if mode == tf.estimator.ModeKeys.TRAIN:
        dataset = dataset.shuffle(buffer_size=1000).repeat()

    # Prefetch overlaps input preprocessing with training. buffer_size=1
    # keeps one batch ready; tf.data.experimental.AUTOTUNE would let
    # tf.data tune this automatically.
    dataset = dataset.prefetch(buffer_size=1)

    return dataset
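
As a quick smoke test (assuming train.csv is still in the working directory), we can pull a single small batch and inspect the features dictionary and label tensor:


In [ ]:
ds = load_dataset(pattern="train*", batch_size=2)
for features, label in ds.take(1):
    for name, tensor in features.items():
        print(name, tensor.numpy())
    print("label", label.numpy())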

Create input layers for raw features.

We'll need to get the data read in by our input function to our model function, but just how do we go about connecting the dots? We can use Keras input layers (tf.keras.layers.Input), defining for each one:

  • shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; None elements represent dimensions where the shape is not known.
  • name: An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.
  • dtype: The data type expected by the input, as a string (float32, float64, int32...)

In [8]:
def create_input_layers():
    """Creates dictionary of input layers for each feature.

    Returns:
        Dictionary of `tf.keras.layers.Input` layers for each feature.
    """
    deep_inputs = {
        colname: tf.keras.layers.Input(
            name=colname, shape=(), dtype="float32")
        for colname in ["mother_age", "gestation_weeks"]
    }

    wide_inputs = {
        colname: tf.keras.layers.Input(
            name=colname, shape=(), dtype="string")
        for colname in ["is_male", "plurality"]
    }

    inputs = {**wide_inputs, **deep_inputs}

    return inputs
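
As a sketch of what this returns: each value is a symbolic tensor of shape (None,), keyed by feature name (the batch dimension is the only dimension, since each feature is a scalar per example).


In [ ]:
inputs = create_input_layers()
for name, tensor in inputs.items():
    print(name, tensor.dtype, tensor.shape)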

Create feature columns for inputs.

Next, define the feature columns. mother_age and gestation_weeks should be numeric. The others, is_male and plurality, should be categorical. Remember, only dense feature columns can be inputs to a DNN.


In [9]:
def categorical_fc(name, values):
    """Helper function to wrap categorical feature by indicator column.

    Args:
        name: str, name of feature.
        values: list, list of strings of categorical values.
    Returns:
        Categorical and indicator column of categorical feature.
    """
    cat_column = tf.feature_column.categorical_column_with_vocabulary_list(
            key=name, vocabulary_list=values)
    ind_column = tf.feature_column.indicator_column(
        categorical_column=cat_column)

    return cat_column, ind_column


def create_feature_columns(nembeds):
    """Creates wide and deep dictionaries of feature columns from inputs.

    Args:
        nembeds: int, number of dimensions to embed categorical column down to.
    Returns:
        Wide and deep dictionaries of feature columns.
    """
    deep_fc = {
        colname: tf.feature_column.numeric_column(key=colname)
        for colname in ["mother_age", "gestation_weeks"]
    }
    wide_fc = {}
    is_male, wide_fc["is_male"] = categorical_fc(
        "is_male", ["True", "False", "Unknown"])
    plurality, wide_fc["plurality"] = categorical_fc(
        "plurality", ["Single(1)", "Twins(2)", "Triplets(3)",
                      "Quadruplets(4)", "Quintuplets(5)", "Multiple(2+)"])

    # Bucketize the float fields. This makes them wide
    age_buckets = tf.feature_column.bucketized_column(
        source_column=deep_fc["mother_age"],
        boundaries=np.arange(15, 45, 1).tolist())
    wide_fc["age_buckets"] = tf.feature_column.indicator_column(
        categorical_column=age_buckets)

    gestation_buckets = tf.feature_column.bucketized_column(
        source_column=deep_fc["gestation_weeks"],
        boundaries=np.arange(17, 47, 1).tolist())
    wide_fc["gestation_buckets"] = tf.feature_column.indicator_column(
        categorical_column=gestation_buckets)

    # Cross the bucketized columns; crossing must happen before one-hot encoding
    crossed = tf.feature_column.crossed_column(
        keys=[age_buckets, gestation_buckets],
        hash_bucket_size=1000)
    deep_fc["crossed_embeds"] = tf.feature_column.embedding_column(
        categorical_column=crossed, dimension=nembeds)

    return wide_fc, deep_fc
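
Feature columns are only specifications; to see concrete tensors, we can wrap them in a DenseFeatures layer and call it on a sample feature dictionary. Here's a sketch with made-up sample values:


In [ ]:
wide_fc, deep_fc = create_feature_columns(nembeds=3)
sample = {
    "mother_age": tf.constant([28.0]),
    "gestation_weeks": tf.constant([39.0]),
    "is_male": tf.constant(["True"]),
    "plurality": tf.constant(["Single(1)"]),
}
wide = tf.keras.layers.DenseFeatures(feature_columns=wide_fc.values())(sample)
print(wide.shape)  # (1, 71): 3 + 6 one-hots plus 31 + 31 bucket indicators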

Create wide and deep model and output layer.

So we've figured out how to get our inputs ready for machine learning, but now we need to connect them to our desired output. Our model architecture is what links the two together. Let's create a wide and deep model now: the wide side will just be a linear model or a single dense layer, and for the deep side we'll create some hidden dense layers. All of this will end in a single dense output layer. Since this is regression, make sure the output layer's activation and shape are correct.


In [10]:
def get_model_outputs(wide_inputs, deep_inputs, dnn_hidden_units):
    """Creates model architecture and returns outputs.

    Args:
        wide_inputs: Dense tensor used as inputs to wide side of model.
        deep_inputs: Dense tensor used as inputs to deep side of model.
        dnn_hidden_units: List of integers where length is number of hidden
            layers and ith element is the number of neurons at ith layer.
    Returns:
        Dense tensor output from the model.
    """
    # Hidden layers for the deep side
    layers = [int(x) for x in dnn_hidden_units]
    deep = deep_inputs
    for layerno, numnodes in enumerate(layers):
        deep = tf.keras.layers.Dense(
            units=numnodes,
            activation="relu",
            name="dnn_{}".format(layerno+1))(deep)
    deep_out = deep

    # Wide side: a single dense layer, kept shallow in the spirit of a
    # linear model (note it uses a ReLU activation here)
    wide_out = tf.keras.layers.Dense(
        units=10, activation="relu", name="linear")(wide_inputs)

    # Concatenate the two sides
    both = tf.keras.layers.concatenate(
        inputs=[deep_out, wide_out], name="both")

    # Final output is a linear activation because this is regression
    output = tf.keras.layers.Dense(
        units=1, activation="linear", name="weight")(both)

    return output

Create custom evaluation metric.

We want to make sure that we have some useful way to measure model performance. Since this is regression, we'd like to know the RMSE of the model on our evaluation dataset; however, RMSE isn't available as a simple string metric like "mse", so we'll create our own from the true and predicted labels.


In [11]:
def rmse(y_true, y_pred):
    """Calculates RMSE evaluation metric.

    Args:
        y_true: tensor, true labels.
        y_pred: tensor, predicted labels.
    Returns:
        Tensor with value of RMSE between true and predicted labels.
    """
    return tf.sqrt(tf.reduce_mean((y_pred - y_true) ** 2))
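
As a quick check on toy tensors: for predictions [1, 2, 5] against labels [1, 2, 3], the squared errors are [0, 0, 4], so the RMSE should be sqrt(4/3) ≈ 1.1547.


In [ ]:
print(rmse(y_true=tf.constant([1.0, 2.0, 3.0]),
           y_pred=tf.constant([1.0, 2.0, 5.0])).numpy())  # ~1.1547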

Build wide and deep model tying all of the pieces together.

Excellent! We've assembled all of the pieces; now we just need to tie them together into a Keras Model. Because this is NOT a simple feedforward model (it has branching, side inputs, etc.), we can't use Keras' Sequential Model API. We'll instead use Keras' Functional Model API: we'll build the model with tf.keras.models.Model, giving it our inputs and outputs, and then compile the model with an optimizer, a loss function, and evaluation metrics.


In [13]:
def build_wide_deep_model(dnn_hidden_units=[64, 32], nembeds=3):
    """Builds wide and deep model using Keras Functional API.

    Args:
        dnn_hidden_units: list of ints, number of neurons in each hidden layer.
        nembeds: int, number of dimensions to embed the crossed column into.
    Returns:
        `tf.keras.models.Model` object.
    """
    # Create input layers
    inputs = create_input_layers()

    # Create feature columns for both wide and deep
    wide_fc, deep_fc = create_feature_columns(nembeds)

    # The constructor for DenseFeatures takes a list of numeric columns
    # The Functional API in Keras requires: LayerConstructor()(inputs)
    wide_inputs = tf.keras.layers.DenseFeatures(
        feature_columns=wide_fc.values(), name="wide_inputs")(inputs)
    deep_inputs = tf.keras.layers.DenseFeatures(
        feature_columns=deep_fc.values(), name="deep_inputs")(inputs)

    # Get output of model given inputs
    output = get_model_outputs(wide_inputs, deep_inputs, dnn_hidden_units)

    # Build model and compile it all together
    model = tf.keras.models.Model(inputs=inputs, outputs=output)
    model.compile(optimizer="adam", loss="mse", metrics=[rmse, "mse"])

    return model

print("Here is our wide and deep architecture so far:\n")
model = build_wide_deep_model()
print(model.summary())


Here is our wide and deep architecture so far:

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/feature_column/feature_column_v2.py:4276: IndicatorColumn._variable_shape (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/feature_column/feature_column_v2.py:4331: BucketizedColumn._num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/feature_column/feature_column_v2.py:4331: VocabularyListCategoricalColumn._num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
gestation_weeks (InputLayer)    [(None,)]            0                                            
__________________________________________________________________________________________________
is_male (InputLayer)            [(None,)]            0                                            
__________________________________________________________________________________________________
mother_age (InputLayer)         [(None,)]            0                                            
__________________________________________________________________________________________________
plurality (InputLayer)          [(None,)]            0                                            
__________________________________________________________________________________________________
deep_inputs (DenseFeatures)     (None, 5)            3000        gestation_weeks[0][0]            
                                                                 is_male[0][0]                    
                                                                 mother_age[0][0]                 
                                                                 plurality[0][0]                  
__________________________________________________________________________________________________
dnn_1 (Dense)                   (None, 64)           384         deep_inputs[0][0]                
__________________________________________________________________________________________________
wide_inputs (DenseFeatures)     (None, 71)           0           gestation_weeks[0][0]            
                                                                 is_male[0][0]                    
                                                                 mother_age[0][0]                 
                                                                 plurality[0][0]                  
__________________________________________________________________________________________________
dnn_2 (Dense)                   (None, 32)           2080        dnn_1[0][0]                      
__________________________________________________________________________________________________
linear (Dense)                  (None, 10)           720         wide_inputs[0][0]                
__________________________________________________________________________________________________
both (Concatenate)              (None, 42)           0           dnn_2[0][0]                      
                                                                 linear[0][0]                     
__________________________________________________________________________________________________
weight (Dense)                  (None, 1)            43          both[0][0]                       
==================================================================================================
Total params: 6,227
Trainable params: 6,227
Non-trainable params: 0
__________________________________________________________________________________________________
None

We can visualize the wide and deep network using the Keras plot_model utility.


In [14]:
tf.keras.utils.plot_model(
    model=model, to_file="wd_model.png", show_shapes=False, rankdir="LR")


Out[14]:

[model graph image; also written to wd_model.png]

Run and evaluate model

Train and evaluate.

We've built our Keras model using our inputs from our CSV files and the architecture we designed. Let's now run our model by training the model parameters and periodically running an evaluation to track how well we are doing on outside data as training goes on. We'll need to load both our train and eval datasets and pass them to our model through the fit method. Make sure you have the right pattern, batch size, and mode when loading the data. Also, don't forget to add the TensorBoard callback.


In [15]:
TRAIN_BATCH_SIZE = 32
NUM_TRAIN_EXAMPLES = 10000 * 5  # training dataset repeats, it'll wrap around
NUM_EVALS = 5  # how many times to evaluate
# Enough to get a reasonable sample, but not so much that it slows down
NUM_EVAL_EXAMPLES = 10000

trainds = load_dataset(
    pattern="train*",
    batch_size=TRAIN_BATCH_SIZE,
    mode=tf.estimator.ModeKeys.TRAIN)

evalds = load_dataset(
    pattern="eval*",
    batch_size=1000,
    mode=tf.estimator.ModeKeys.EVAL).take(count=NUM_EVAL_EXAMPLES // 1000)

steps_per_epoch = NUM_TRAIN_EXAMPLES // (TRAIN_BATCH_SIZE * NUM_EVALS)  # 50000 // (32 * 5) = 312

logdir = os.path.join(
    "logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(
    log_dir=logdir, histogram_freq=1)

history = model.fit(
    trainds,
    validation_data=evalds,
    epochs=NUM_EVALS,
    steps_per_epoch=steps_per_epoch,
    callbacks=[tensorboard_callback])


WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/data/experimental/ops/readers.py:521: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/data/experimental/ops/readers.py:215: shuffle_and_repeat (from tensorflow.python.data.experimental.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.shuffle(buffer_size, seed)` followed by `tf.data.Dataset.repeat(count)`. Static tf.data optimizations will take care of using the fused implementation.
Train for 312 steps, validate for 10 steps
Epoch 1/5
312/312 [==============================] - 6s 19ms/step - loss: 4.6018 - rmse: 1.5056 - mse: 4.6018 - val_loss: 1.2495 - val_rmse: 1.1176 - val_mse: 1.2495
Epoch 2/5
312/312 [==============================] - 3s 9ms/step - loss: 1.2032 - rmse: 1.0866 - mse: 1.2032 - val_loss: 1.1678 - val_rmse: 1.0804 - val_mse: 1.1678
Epoch 3/5
312/312 [==============================] - 3s 9ms/step - loss: 1.1508 - rmse: 1.0627 - mse: 1.1508 - val_loss: 1.0926 - val_rmse: 1.0451 - val_mse: 1.0926
Epoch 4/5
312/312 [==============================] - 3s 10ms/step - loss: 1.0659 - rmse: 1.0234 - mse: 1.0659 - val_loss: 1.0576 - val_rmse: 1.0281 - val_mse: 1.0576
Epoch 5/5
312/312 [==============================] - 3s 10ms/step - loss: 1.0956 - rmse: 1.0370 - mse: 1.0956 - val_loss: 1.0621 - val_rmse: 1.0304 - val_mse: 1.0621
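
If we also want a score on held-out data, here's a quick sketch using the test.csv we verified earlier (5 batches of 1,000 examples; evaluate returns the loss followed by our compiled metrics):


In [ ]:
testds = load_dataset(
    pattern="test*",
    batch_size=1000,
    mode=tf.estimator.ModeKeys.EVAL).take(count=5)
print(model.evaluate(testds))  # [loss, rmse, mse]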

Visualize loss curve


In [16]:
# Plot
nrows = 1
ncols = 2
fig = plt.figure(figsize=(10, 5))

for idx, key in enumerate(["loss", "rmse"]):
    ax = fig.add_subplot(nrows, ncols, idx+1)
    plt.plot(history.history[key])
    plt.plot(history.history["val_{}".format(key)])
    plt.title("model {}".format(key))
    plt.ylabel(key)
    plt.xlabel("epoch")
    plt.legend(["train", "validation"], loc="upper left");


Save the model


In [17]:
OUTPUT_DIR = "babyweight_trained_wd"
shutil.rmtree(OUTPUT_DIR, ignore_errors=True)
EXPORT_PATH = os.path.join(
    OUTPUT_DIR, datetime.datetime.now().strftime("%Y%m%d%H%M%S"))
tf.saved_model.save(
    obj=model, export_dir=EXPORT_PATH)  # with default serving function
print("Exported trained model to {}".format(EXPORT_PATH))


WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1781: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: babyweight_trained_wd/20191029083701/assets
Exported trained model to babyweight_trained_wd/20191029083701

In [ ]:
!ls $EXPORT_PATH
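
To sanity-check the export, we can load the SavedModel back and run a prediction through its serving signature. This is a sketch: the signature name serving_default and the output key (the output layer's name, weight) follow TF 2.0's default Keras export, so it's worth confirming them with saved_model_cli show --dir $EXPORT_PATH --all.


In [ ]:
loaded = tf.saved_model.load(EXPORT_PATH)
infer = loaded.signatures["serving_default"]
# Example input values are made up.
pred = infer(
    is_male=tf.constant(["True"]),
    mother_age=tf.constant([26.0]),
    plurality=tf.constant(["Single(1)"]),
    gestation_weeks=tf.constant([39.0]))
print(pred)  # e.g. {'weight': <tf.Tensor: shape=(1, 1), ...>}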

Lab Summary:

In this lab, we started by defining the CSV column names, label column, and column defaults for our data inputs. Then, we constructed a tf.data Dataset of features and the label from the CSV files and created input layers for the raw features. Next, we set up feature columns for the model inputs and built a wide and deep neural network in Keras. We created a custom evaluation metric and built our wide and deep model. Finally, we trained and evaluated our model.

Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.