Introducing the Keras Functional API

Learning Objectives

  1. Understand embeddings and how to create them with the feature column API
  2. Understand Deep and Wide models and when to use them
  3. Understand the Keras functional API and how to build a deep and wide model with it


In the last notebook, we learned about the Keras Sequential API. The Keras Functional API provides an alternate way of building models which is more flexible. With the Functional API, we can build models with more complex topologies, multiple input or output layers, shared layers or non-sequential data flows (e.g. residual layers).
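As a minimal sketch of what "multiple inputs" and "non-sequential data flow" mean in practice, the toy model below merges two input branches with the Functional API. The layer names, sizes, and the `toy_model` variable are illustrative assumptions and are not part of this lab's model.

```python
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Dense, concatenate

# Two separate inputs; names and shapes are made up for illustration.
input_a = Input(shape=(3,), name="input_a")
input_b = Input(shape=(2,), name="input_b")

# One branch passes through a hidden layer, the other is used as-is.
hidden = Dense(4, activation="relu", name="hidden")(input_a)

# Merging two branches is something a Sequential model cannot express.
merged = concatenate([hidden, input_b], name="merged")
output = Dense(1, name="output")(merged)

toy_model = Model(inputs=[input_a, input_b], outputs=output)

# Forward pass on a batch of 5 random examples.
batch = [np.random.rand(5, 3).astype("float32"),
         np.random.rand(5, 2).astype("float32")]
predictions = toy_model.predict(batch, verbose=0)
print(predictions.shape)  # (5, 1)
```

Each layer is called on the tensor produced by the previous one, so the model graph is defined by ordinary Python variable flow rather than by a fixed stack of layers.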

In this notebook we'll use what we learned about feature columns to build a Wide & Deep model. Recall that the idea behind Wide & Deep models is to combine learning through memorization with learning through generalization by training a wide linear model and a deep neural network together. You can have a look at the original research paper here: Wide & Deep Learning for Recommender Systems.


The Wide part of the model handles memorization. Here, we train a linear model on a wide set of crossed features and learn how those feature combinations correlate with the label. The Deep part of the model handles generalization: it represents features as embedding vectors, and the embeddings are learned during training. While both of these methods can work well alone, Wide & Deep models excel by combining them.
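To make the two paths concrete, here is a small conceptual sketch in plain Python (no TensorFlow). All weights and inputs are made-up numbers and `dense` is a hypothetical helper; the point is only that the wide input reaches the output unit directly, while the deep input first passes through a hidden layer.

```python
def relu(x):
    return max(0.0, x)

def dense(inputs, weights, biases, activation=None):
    """Apply one fully connected layer to a list of floats."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(total) if activation else total)
    return outputs

# Wide input: one-hot encoding of a crossed feature (sparse, mostly zeros).
wide_input = [0.0, 0.0, 1.0, 0.0]
# Deep input: a small embedding vector plus one numeric feature.
deep_input = [0.5, -1.0, 2.0]

# Deep path: one hidden layer with two ReLU units.
hidden = dense(deep_input,
               weights=[[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]],
               biases=[0.0, 0.0],
               activation=relu)

# Combine: the wide input skips the hidden layer entirely.
combined = wide_input + hidden

# A single linear output unit sees both the wide and the deep signal.
prediction = dense(combined,
                   weights=[[0.1, 0.2, 0.3, 0.4, 2.0, 1.0]],
                   biases=[1.0])[0]
print(hidden, prediction)
```

With these numbers the hidden layer produces `[1.5, 0.0]` and the prediction is approximately 4.3: 0.3 comes from the memorized crossed feature, 3.0 from the deep path, and 1.0 from the bias.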

# Ensure the right version of TensorFlow is installed.
!pip freeze | grep tensorflow==2.1 || pip install tensorflow==2.1

Start by importing the necessary libraries for this lab.

import datetime
import os
import shutil

import numpy as np
import pandas as pd
import tensorflow as tf

from matplotlib import pyplot as plt
from tensorflow import keras

from tensorflow import feature_column as fc

from tensorflow.keras import Model
from tensorflow.keras.layers import (
    Input, Dense, DenseFeatures, concatenate)
from tensorflow.keras.callbacks import TensorBoard


%matplotlib inline

Load raw data

We will use the taxifare dataset, using the CSV files that we created in the first notebook of this sequence. Those files have been saved into ../data.

!ls -l ../data/*.csv

Use tf.data to read the CSV files

We wrote these functions for reading data from the csv files above in the previous notebook. For this lab we will also include some additional engineered features in our model. In particular, we will compute the difference in latitude and longitude, as well as the Euclidean distance between the pick-up and drop-off locations. We can accomplish this by adding these new features to the features dictionary with the function add_engineered_features below.

Note that we include a call to this function when collecting our features dict and labels in the features_and_labels function below as well.
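The function itself is not shown in this section, so the following is only a sketch of the arithmetic it describes, written on plain Python floats rather than tensors. The dictionary keys and the names `latdiff`, `londiff`, `euclidean_dist`, and `add_engineered_features_sketch` are assumptions made for illustration.

```python
import math

def add_engineered_features_sketch(features):
    """Return a copy of `features` with three extra location features."""
    engineered = dict(features)
    lat_diff = features["dropoff_latitude"] - features["pickup_latitude"]
    lon_diff = features["dropoff_longitude"] - features["pickup_longitude"]
    engineered["latdiff"] = lat_diff
    engineered["londiff"] = lon_diff
    # Straight-line distance in degrees, not in kilometres.
    engineered["euclidean_dist"] = math.sqrt(lat_diff ** 2 + lon_diff ** 2)
    return engineered

example = {"pickup_latitude": 40.0, "pickup_longitude": -74.0,
           "dropoff_latitude": 40.03, "dropoff_longitude": -73.96}
result = add_engineered_features_sketch(example)
print(result["latdiff"], result["londiff"], result["euclidean_dist"])
```

For this example ride the differences are about 0.03 and 0.04 degrees, giving a distance of about 0.05 degrees.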

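# Assumption: column names and order follow the CSV files written in the first
# notebook; DEFAULTS below must line up with them one-to-one.
CSV_COLUMNS = [
    'fare_amount',
    'pickup_datetime',
    'pickup_longitude',
    'pickup_latitude',
    'dropoff_longitude',
    'dropoff_latitude',
    'passenger_count',
    'key',
]
LABEL_COLUMN = 'fare_amount'
DEFAULTS = [[0.0], ['na'], [0.0], [0.0], [0.0], [0.0], [0.0], ['na']]
UNWANTED_COLS = ['pickup_datetime', 'key']

def features_and_labels(row_data):
    # Split one parsed CSV batch into a features dict and a label tensor.
    label = row_data.pop(LABEL_COLUMN)
    features = row_data
    for unwanted_col in UNWANTED_COLS:
        features.pop(unwanted_col)

    return features, label

def create_dataset(pattern, batch_size=1, mode='eval'):
    # Parse every CSV file matching `pattern` into batches of column tensors.
    dataset = tf.data.experimental.make_csv_dataset(
        pattern, batch_size, CSV_COLUMNS, DEFAULTS)

    dataset = dataset.map(features_and_labels)
    if mode == 'train':
        dataset = dataset.shuffle(buffer_size=1000).repeat()

    # Prefetch one batch so input processing overlaps with training.
    dataset = dataset.prefetch(1)
    return dataset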

Feature columns for Wide and Deep model

For the Wide columns, we will create feature columns of crossed features. To do this, we'll create a collection of TensorFlow feature columns to pass to the tf.feature_column.crossed_column constructor. The Deep columns will consist of numeric columns and the embedding columns we want to create.

Lab Task #1: In the cell below, create feature columns for our wide-and-deep model. You'll need to build:

  1. bucketized columns using tf.feature_column.bucketized_column for the pickup and dropoff latitude and longitude,
  2. crossed columns using tf.feature_column.crossed_column for those bucketized columns, and
  3. embedding columns using tf.feature_column.embedding_column for the crossed columns.
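Before filling in the TensorFlow calls, it can help to see what bucketizing and crossing do to a single value. The sketch below uses plain Python only; the boundaries, the `bucketize` and `cross` helpers, and the simple modulo used in place of real hashing are all assumptions made for illustration, not the feature column implementation.

```python
import bisect

def bucketize(value, boundaries):
    """Return the index of the bucket that `value` falls into."""
    # 0 means below the first boundary, len(boundaries) means at or above the last.
    return bisect.bisect_right(boundaries, value)

def cross(bucket_a, bucket_b, num_buckets_b, hash_bucket_size):
    """Map a pair of bucket indices to one id in [0, hash_bucket_size)."""
    # Assumption: a modulo stands in for the hashing a crossed column applies.
    return (bucket_a * num_buckets_b + bucket_b) % hash_bucket_size

# Hypothetical boundaries, one degree apart.
lat_boundaries = [38.0, 39.0, 40.0, 41.0, 42.0]
lon_boundaries = [-76.0, -75.0, -74.0, -73.0, -72.0]

pickup_lat_bucket = bucketize(40.7, lat_boundaries)   # 3
pickup_lon_bucket = bucketize(-73.9, lon_boundaries)  # 3

# Five boundaries define six buckets per dimension, so 36 possible pairs.
num_buckets = len(lon_boundaries) + 1
crossed_id = cross(pickup_lat_bucket, pickup_lon_bucket,
                   num_buckets, hash_bucket_size=36)  # 21

# A wide (linear) model would see this as a one-hot vector of length 36.
one_hot = [1 if i == crossed_id else 0 for i in range(36)]
print(pickup_lat_bucket, pickup_lon_bucket, crossed_id)  # 3 3 21
```

An embedding column then replaces the long one-hot vector with a short learned dense vector per id, which is what the deep part of the model consumes.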

# TODO 1

# 1. Bucketize latitudes and longitudes
latbuckets = np.linspace(start=38.0, stop=42.0, num=NBUCKETS).tolist()
lonbuckets = np.linspace(start=-76.0, stop=-72.0, num=NBUCKETS).tolist()

fc_bucketized_plat = # TODO: Your code goes here.
fc_bucketized_plon = # TODO: Your code goes here.
fc_bucketized_dlat = # TODO: Your code goes here.
fc_bucketized_dlon = # TODO: Your code goes here.

# 2. Cross features for locations
fc_crossed_dloc = # TODO: Your code goes here.
fc_crossed_ploc = # TODO: Your code goes here.
fc_crossed_pd_pair = # TODO: Your code goes here.

# 3. Create embedding columns for the crossed columns
fc_pd_pair = # TODO: Your code goes here.
fc_dloc = # TODO: Your code goes here.
fc_ploc = # TODO: Your code goes here.

Gather list of feature columns

Next we gather the lists of wide and deep feature columns we'll pass to our Wide & Deep model in TensorFlow. Recall that wide columns are sparse and have a linear relationship with the output, while deep columns are continuous and can have a complex relationship with the output. We will use our previously bucketized columns to collect crossed feature columns and sparse feature columns for our wide columns, and embedding feature columns and numeric feature columns for the deep columns.

Lab Task #2: Collect the wide and deep columns into two separate lists. You'll have two lists: One called wide_columns containing the one-hot encoded features from the crossed features and one called deep_columns which contains numeric and embedding feature columns.

# TODO 2
wide_columns = [
    # One-hot encoded feature crosses
    # TODO: Your code goes here.
]

deep_columns = [
    # Embedding_column to "group" together ...
    # TODO: Your code goes here.

    # Numeric columns
    # TODO: Your code goes here.
]

Build a Wide and Deep model in Keras

To build a wide-and-deep network, we connect the sparse (i.e. wide) features directly to the output node, but pass the dense (i.e. deep) features through a set of fully connected layers. Here's how that model architecture looks using the Functional API.

First, we'll create our input columns using tf.keras.layers.Input.

inputs = {colname: Input(name=colname, shape=(), dtype='float32')
          for colname in INPUT_COLS}

Then, we'll define our custom RMSE evaluation metric and build our wide and deep model.

Lab Task #3: Complete the code in the function build_model below so that it returns a compiled Keras model. The argument dnn_hidden_units should represent the number of units in each layer of your network. Use the Functional API to build a wide-and-deep model. Use the deep_columns you created above to build the deep layers and the wide_columns to create the wide layers. Once you have the wide and deep components, you will combine them to feed to a final fully connected layer.
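The RMSE metric used in this section is the square root of the mean squared difference between predictions and labels. As a small worked example in plain Python (the fare values are made up, and `rmse_plain` is a hypothetical name used only here):

```python
import math

def rmse_plain(y_true, y_pred):
    """Root mean squared error for two equal-length lists of floats."""
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

y_true = [10.0, 12.0, 8.0]
y_pred = [11.0, 10.0, 8.0]

# Errors are 1, -2 and 0, so the squared errors are 1, 4 and 0.
example_rmse = rmse_plain(y_true, y_pred)
print(example_rmse)  # sqrt(5 / 3), about 1.29
```

Because the errors are squared before averaging, the single 2-dollar miss contributes four times as much as the 1-dollar miss.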

def rmse(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))

# TODO 3
def build_model(dnn_hidden_units):
    # Create the deep part of model
    deep = # TODO: Your code goes here.
    # Create the wide part of model
    wide = # TODO: Your code goes here.

    # Combine deep and wide parts of the model
    combined = # TODO: Your code goes here.

    # Map the combined outputs into a single prediction value
    output = # TODO: Your code goes here.
    # Finalize the model
    model = # TODO: Your code goes here.

    # Compile the keras model
    model.compile( # TODO: Your code goes here.
    return model

Next, we can call build_model to create the model. Here we'll have two hidden layers, each with 10 neurons, for the deep part of our model. We can also use plot_model to see a diagram of the model we've created.

HIDDEN_UNITS = [10,10]

model = build_model(dnn_hidden_units=HIDDEN_UNITS)

tf.keras.utils.plot_model(model, show_shapes=False, rankdir='LR')

Next, we'll set up our training variables, create our datasets for training and validation, and train our model.

(We refer you to the blog post ML Design Pattern #3: Virtual Epochs for further details on why we express the training in terms of NUM_TRAIN_EXAMPLES and NUM_EVALS and why, in this training code, the number of epochs is really equal to the number of evaluations we perform.)
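The arithmetic behind virtual epochs can be sketched as follows. The batch size of 100 is a hypothetical value chosen for this example, and `virtual_epoch_steps` is an illustrative helper rather than part of the lab code.

```python
def virtual_epoch_steps(num_train_examples, batch_size, num_evals):
    """Number of training steps to run between two evaluations."""
    return num_train_examples // (batch_size * num_evals)

# 50,000 training examples, evaluated 50 times, with an assumed batch size.
steps = virtual_epoch_steps(num_train_examples=10000 * 5,
                            batch_size=100,
                            num_evals=50)
print(steps)  # 10

# Total examples seen over all virtual epochs matches the training budget.
examples_seen = steps * 100 * 50
print(examples_seen)  # 50000
```

Each "epoch" reported by Keras is then one of these fixed-size chunks followed by an evaluation, regardless of how many rows the CSV files actually contain.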

NUM_TRAIN_EXAMPLES = 10000 * 5  # training dataset will repeat, wrap around
NUM_EVALS = 50  # how many times to evaluate
NUM_EVAL_EXAMPLES = 10000  # enough to get a reasonable sample

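# Assumption: TRAIN_PATTERN, VALID_PATTERN and BATCH_SIZE are placeholders for the
# arguments missing here; set them to your CSV globs and batch size first.
trainds = create_dataset(
    pattern=TRAIN_PATTERN, batch_size=BATCH_SIZE, mode='train')

evalds = create_dataset(
    pattern=VALID_PATTERN, batch_size=BATCH_SIZE,
    mode='eval').take(NUM_EVAL_EXAMPLES // BATCH_SIZE)

OUTDIR = "./taxi_trained"
shutil.rmtree(path=OUTDIR, ignore_errors=True) # start fresh each time

# The training set repeats, so one "epoch" is a fixed number of steps.
steps_per_epoch = NUM_TRAIN_EXAMPLES // (BATCH_SIZE * NUM_EVALS)
history = model.fit(trainds, steps_per_epoch=steps_per_epoch, epochs=NUM_EVALS,
                    validation_data=evalds, callbacks=[TensorBoard(OUTDIR)])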

Just as before, we can examine the history to see how the RMSE changes through training on the train set and validation set.

RMSE_COLS = ['rmse', 'val_rmse']

pd.DataFrame(history.history)[RMSE_COLS].plot()


