Mixed Input Representations Design Pattern

This design pattern refers to both:

  • Combining different types of data, like images + tabular metadata
  • Representing the same input in multiple formats

In [0]:
import numpy as np
import pandas as pd
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Embedding, Input, Flatten, Conv2D, MaxPooling2D

from google.cloud import bigquery

Combining text and tabular inputs

To demonstrate this we'll use a toy example dataset of restaurant reviews, defined below. For demo purposes, we've only defined the training dataset here.


In [0]:
reviews_data = {
    "review_text": ["The food was great, but it took forever to get seated.", "The tacos were life changing.", "This food made me question the presence of my taste buds."],
    "meal_type": ["lunch", "dinner", "dinner"],
    "meal_total": [50, 75, 60],
    "rating": [4, 5, 1]
}

Step 1: process review_text so it can be fed to an embedding layer


In [27]:
vocab_size = 50
tokenize = keras.preprocessing.text.Tokenizer(num_words=vocab_size)
tokenize.fit_on_texts(reviews_data['review_text'])

reviews_train = tokenize.texts_to_sequences(reviews_data['review_text'])
max_sequence_len = 20
reviews_train = keras.preprocessing.sequence.pad_sequences(reviews_train, maxlen=max_sequence_len, padding='post')

print(reviews_train)


[[ 1  2  3  4  5  6  7  8  9 10 11  0  0  0  0  0  0  0  0  0]
 [ 1 12 13 14 15  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [16  2 17 18 19  1 20 21 22 23 24  0  0  0  0  0  0  0  0  0]]

Step 2: convert meal_type to one-hot


In [0]:
possible_meal_vocab = ['breakfast', 'lunch', 'dinner']
one_hot_meals = []

for i in reviews_data['meal_type']:
  one_hot_arr = [0] * len(possible_meal_vocab)
  one_index = possible_meal_vocab.index(i)
  one_hot_arr[one_index] = 1
  one_hot_meals.append(one_hot_arr)
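The loop above is equivalent to indexing into an identity matrix; a vectorized NumPy sketch of the same encoding (`keras.utils.to_categorical` applied to the indices would also work):

```python
import numpy as np

possible_meal_vocab = ['breakfast', 'lunch', 'dinner']
meal_types = ['lunch', 'dinner', 'dinner']

# Rows of an identity matrix are exactly the one-hot vectors
indices = [possible_meal_vocab.index(m) for m in meal_types]
one_hot_meals = np.eye(len(possible_meal_vocab), dtype=int)[indices]
# → [[0 1 0]
#    [0 0 1]
#    [0 0 1]]
```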

Step 3: combine one-hot meal_type with meal_total into a single array


In [0]:
tabular_features = np.concatenate((np.array(one_hot_meals), np.expand_dims(reviews_data['meal_total'], axis=1)), axis=1)

Step 4: build the tabular and embedding layers with the Keras functional API


In [0]:
embedding_input = Input(shape=(max_sequence_len,))
# input_dim must cover the full vocabulary, since token ids run from 0 to vocab_size - 1
embedding_layer = Embedding(vocab_size, 64)(embedding_input)
embedding_layer = Flatten()(embedding_layer)
embedding_layer = Dense(3, activation='relu')(embedding_layer)

tabular_input = Input(shape=(len(tabular_features[0]),))
tabular_layer = Dense(32, activation='relu')(tabular_input)

Step 5: concatenate the layers into a model


In [0]:
merged_input = keras.layers.concatenate([embedding_layer, tabular_layer])
merged_dense = Dense(16)(merged_input)
output = Dense(1)(merged_dense)

model = Model(inputs=[embedding_input, tabular_input], outputs=output)

In [32]:
# Preview the model architecture
model.summary()


Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_5 (InputLayer)            [(None, 20)]         0                                            
__________________________________________________________________________________________________
embedding_3 (Embedding)         (None, 20, 64)       3200        input_5[0][0]                    
__________________________________________________________________________________________________
flatten_3 (Flatten)             (None, 1280)         0           embedding_3[0][0]                
__________________________________________________________________________________________________
input_6 (InputLayer)            [(None, 4)]          0                                            
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 3)            3843        flatten_3[0][0]                  
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 32)           160         input_6[0][0]                    
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 35)           0           dense_4[0][0]                    
                                                                 dense_5[0][0]                    
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 16)           576         concatenate[0][0]                
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 1)            17          dense_6[0][0]                    
==================================================================================================
Total params: 7,796
Trainable params: 7,796
Non-trainable params: 0
__________________________________________________________________________________________________
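With both input layers merged into a single model, training just requires passing the two input arrays in the same order as the Model's inputs list. A minimal, self-contained sketch (toy stand-ins with the same shapes as reviews_train and tabular_features; in the notebook you'd pass those arrays directly):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Embedding, Flatten, Input

# Toy stand-ins shaped like reviews_train, tabular_features, and rating above
reviews_train = np.random.randint(0, 50, size=(3, 20))
tabular_features = np.random.rand(3, 4).astype('float32')
ratings = np.array([4.0, 5.0, 1.0], dtype='float32')

embedding_input = Input(shape=(20,))
x = Flatten()(Embedding(50, 64)(embedding_input))
x = Dense(3, activation='relu')(x)
tabular_input = Input(shape=(4,))
t = Dense(32, activation='relu')(tabular_input)
output = Dense(1)(Dense(16)(keras.layers.concatenate([x, t])))
model = Model(inputs=[embedding_input, tabular_input], outputs=output)

# Regression on the star rating: mean squared error is a reasonable default
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(x=[reviews_train, tabular_features], y=ratings, epochs=2, verbose=0)
```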

Tabular data multiple ways

We'll build on the review dataset example in the previous section to show a quick demo of how you might represent rating as a feature in two different ways.

Step 1: write a function to bucket ratings using a threshold


In [0]:
def good_or_bad(rating):
  if rating > 3:
    return 1
  else:
    return 0

Step 2: Bucket the data and create a new input array with both the numeric and bucketed (boolean) rating


In [34]:
rating_processed = []

for i in reviews_data['rating']:
  rating_processed.append([i, good_or_bad(i)])

print(rating_processed)


[[4, 1], [5, 1], [1, 0]]
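The loop above can also be written in one vectorized step:

```python
import numpy as np

ratings = np.array([4, 5, 1])  # reviews_data['rating']

# Pair each raw rating with its bucketed (rating > 3) version
rating_processed = np.column_stack((ratings, (ratings > 3).astype(int)))
# → [[4 1]
#    [5 1]
#    [1 0]]
```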

Mixed text representations

We'll use the Stack Overflow dataset in BigQuery to show how you can create a model with text represented as both Bag of Words and embeddings. To run the code in this section you'll need a Google Cloud project.


In [0]:
# Authenticate to connect to BigQuery
from google.colab import auth
auth.authenticate_user()

Step 1: Get the data from BigQuery and save it to a Pandas DataFrame.

Be sure to replace your-project below with the name of your Google Cloud project.


In [0]:
%%bigquery df --project your-project
SELECT
  title,
  answer_count,
  REPLACE(tags, "|", ",") as tags
FROM
  `bigquery-public-data.stackoverflow.posts_questions`
WHERE
  REGEXP_CONTAINS( tags, r"(?:keras|matplotlib|pandas)")
LIMIT 1000

In [49]:
# Preview the dataset
df.head()


Out[49]:
title answer_count tags
0 how to plot multiple lines while reading x and... 1 python,matplotlib,plot
1 How to assign child objects to parent objects ... 1 python,for-loop,pandas,dataframe
2 python: join group size to member rows in data... 2 python,pandas
3 Groupby Sum ignoring few columns 1 python,pandas
4 Why can't I freeze_panes on the xlsxwriter obj... 1 python,pandas,xlsxwriter

Step 2: Define the vocab size and max sequence length, and create an instance of the Tokenizer class, fitting it on our Stack Overflow data.


In [0]:
stacko_vocab_size = 200
stacko_sequence_len = 40

# Create a tokenizer for this data
stacko_tokenize = keras.preprocessing.text.Tokenizer(num_words=stacko_vocab_size)
stacko_tokenize.fit_on_texts(df['title'].values)

In [39]:
# Preview the first 20 words in the Tokenizer's vocabulary
list(stacko_tokenize.word_index.keys())[:20]


Out[39]:
['pandas',
 'in',
 'to',
 'a',
 'matplotlib',
 'of',
 'dataframe',
 'python',
 'with',
 'how',
 'and',
 'on',
 'the',
 'using',
 'data',
 'column',
 'plot',
 'from',
 'for',
 'columns']

Step 3: Convert the questions to sequences for the embedding representation


In [0]:
questions_train_embedding = stacko_tokenize.texts_to_sequences(df['title'].values)
questions_train_embedding = keras.preprocessing.sequence.pad_sequences(questions_train_embedding, maxlen=stacko_sequence_len, padding='post')

In [41]:
# Preview the embedding representation with the actual input text
print(df['title'].iloc[0])
print(questions_train_embedding[0])


how to plot multiple lines while reading x and y from files in a for loop?
[ 10   3  17  23  74 112 176  44  11  63  18 126   2   4  19 127   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0]

Step 4: Create the Bag of Words representation


In [42]:
questions_train_matrix = stacko_tokenize.texts_to_matrix(df['title'].values)
print(questions_train_matrix[0])


[0. 0. 1. 1. 1. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 1. 1. 1. 0. 0. 0. 1.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0.]
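Each row above is a binary bag-of-words vector: one column per vocabulary index, with a 1 wherever that word occurs in the title (index 0 is reserved by the Tokenizer, so it stays 0). A hand-rolled sketch of the same idea, assuming a small hypothetical word_index:

```python
import numpy as np

vocab_size = 10
word_index = {'plot': 1, 'pandas': 2, 'dataframe': 3}  # hypothetical tokenizer vocabulary

def to_bow(words, word_index, vocab_size):
    # Binary bag of words: set a 1 in the column for each in-vocabulary word
    row = np.zeros(vocab_size)
    for word in words:
        index = word_index.get(word)
        if index is not None and index < vocab_size:
            row[index] = 1.0
    return row

bow = to_bow(['plot', 'pandas', 'plot'], word_index, vocab_size)
# → [0. 1. 1. 0. 0. 0. 0. 0. 0. 0.]
```

Note word order and repetition are discarded, which is exactly what the embedding representation adds back.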

Step 5: Create the embedding and BOW input layers


In [0]:
embedding_input = Input(shape=(stacko_sequence_len,))
# input_dim must cover the tokenizer's vocabulary (token ids run from 0 to stacko_vocab_size - 1)
embedding_layer = Embedding(stacko_vocab_size, 64)(embedding_input)
embedding_layer = Flatten()(embedding_layer)
embedding_layer = Dense(32, activation='relu')(embedding_layer)

bow_input = Input(shape=(stacko_vocab_size,))
bow_layer = Dense(32, activation='relu')(bow_input)

Step 6: Create the model with the embedding and BOW layers


In [0]:
merged_text_input = keras.layers.concatenate([embedding_layer, bow_layer])
merged_dense_text = Dense(16)(merged_text_input)
merged_output = Dense(1)(merged_dense_text)

model = Model(inputs=[embedding_input, bow_input], outputs=merged_output)

In [45]:
model.summary()


Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_7 (InputLayer)            [(None, 40)]         0                                            
__________________________________________________________________________________________________
embedding_4 (Embedding)         (None, 40, 64)       12800       input_7[0][0]                    
__________________________________________________________________________________________________
flatten_4 (Flatten)             (None, 2560)         0           embedding_4[0][0]                
__________________________________________________________________________________________________
input_8 (InputLayer)            [(None, 200)]        0                                            
__________________________________________________________________________________________________
dense_8 (Dense)                 (None, 32)           81952       flatten_4[0][0]                  
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 32)           6432        input_8[0][0]                    
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 64)           0           dense_8[0][0]                    
                                                                 dense_9[0][0]                    
__________________________________________________________________________________________________
dense_10 (Dense)                (None, 16)           1040        concatenate_1[0][0]              
__________________________________________________________________________________________________
dense_11 (Dense)                (None, 1)            17          dense_10[0][0]                   
==================================================================================================
Total params: 102,241
Trainable params: 102,241
Non-trainable params: 0
__________________________________________________________________________________________________

Extracting tabular features from text

We'll create a new dataset of Stack Overflow questions, this time adding some tabular features extracted from the text and changing the prediction task to whether a question is answered.

Step 1: Get the data in BigQuery. Remember to replace your-project below with the name of your GCP project.


In [0]:
%%bigquery df_tabular --project your-project
SELECT
  title,
  answer_count,
  LENGTH(title) AS title_len,
  ARRAY_LENGTH(SPLIT(title, " ")) AS word_count,
  ENDS_WITH(title, "?") AS ends_with_q_mark,
  REPLACE(tags, "|", ",") as tags,
  IF
    (answer_count > 0,
      1,
      0) AS is_answered
FROM
  `bigquery-public-data.stackoverflow.posts_questions`
WHERE
  REGEXP_CONTAINS( tags, r"(?:keras|matplotlib|pandas)")
LIMIT 1000

In [52]:
df_tabular.head()


Out[52]:
title answer_count title_len word_count ends_with_q_mark tags is_answered
0 how to plot multiple lines while reading x and... 1 74 16 True python,matplotlib,plot 1
1 How to assign child objects to parent objects ... 1 65 11 False python,for-loop,pandas,dataframe 1
2 python: join group size to member rows in data... 2 51 9 False python,pandas 1
3 Groupby Sum ignoring few columns 1 32 5 False python,pandas 1
4 Why can't I freeze_panes on the xlsxwriter obj... 1 76 13 True python,pandas,xlsxwriter 1

Step 2: Extract the tabular features


In [57]:
# Use .copy() so the .astype assignment below doesn't trigger pandas' SettingWithCopyWarning
stacko_tabular_features = df_tabular[['title_len', 'word_count', 'ends_with_q_mark']].copy()
stacko_tabular_features['ends_with_q_mark'] = stacko_tabular_features['ends_with_q_mark'].astype(int)

stacko_tabular_features.head()


Out[57]:
title_len word_count ends_with_q_mark
0 74 16 1
1 65 11 0
2 51 9 0
3 32 5 0
4 76 13 1

Step 3: Create an Input layer for the tabular features


In [0]:
stacko_tabular_input = Input(shape=(len(stacko_tabular_features.values[0]),))
stacko_tabular_layer = Dense(32, activation='relu')(stacko_tabular_input)

Step 4: Define a model using stacko_tabular_layer and our BOW layer from above


In [0]:
merged_mixed_input = keras.layers.concatenate([stacko_tabular_layer, bow_layer])
merged_mixed_text = Dense(16)(merged_mixed_input)
merged_mixed_output = Dense(1)(merged_mixed_text)

mixed_text_model = Model(inputs=[stacko_tabular_input, bow_input], outputs=merged_mixed_output)

In [66]:
mixed_text_model.summary()


Model: "model_5"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_11 (InputLayer)           [(None, 3)]          0                                            
__________________________________________________________________________________________________
input_8 (InputLayer)            [(None, 200)]        0                                            
__________________________________________________________________________________________________
dense_16 (Dense)                (None, 32)           128         input_11[0][0]                   
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, 32)           6432        input_8[0][0]                    
__________________________________________________________________________________________________
concatenate_5 (Concatenate)     (None, 64)           0           dense_16[0][0]                   
                                                                 dense_9[0][0]                    
__________________________________________________________________________________________________
dense_21 (Dense)                (None, 16)           1040        concatenate_5[0][0]              
__________________________________________________________________________________________________
dense_22 (Dense)                (None, 1)            17          dense_21[0][0]                   
==================================================================================================
Total params: 7,617
Trainable params: 7,617
Non-trainable params: 0
__________________________________________________________________________________________________
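Note that the final Dense(1) layer has no activation, so this model emits logits; if you compile it with keras.losses.BinaryCrossentropy(from_logits=True) for the is_answered label, an answered/unanswered prediction is recovered by applying a sigmoid to the output. A small NumPy sketch (the logit values here are made up):

```python
import numpy as np

def sigmoid(z):
    # Map a raw logit to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([-1.2, 0.3, 2.5])   # hypothetical raw model outputs
probs = sigmoid(logits)               # P(question is answered)
preds = (probs > 0.5).astype(int)
# → [0 1 1]
```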

Mixed image representations

Note: no dataset is used here; the example below demonstrates how you could create a model that merges both pixel-value and tiled representations of an image.

Step 1: Define the pixel value and tiled representation layers


In [0]:
# Define image input layer (same shape for both pixel and tiled representation)
image_input = Input(shape=(28,28,1))

# Define pixel representation
pixel_layer = Flatten()(image_input)

# Define tiled representation
tiled_layer = Conv2D(filters=16, kernel_size=3, activation='relu')(image_input)
tiled_layer = MaxPooling2D()(tiled_layer)
tiled_layer = tf.keras.layers.Flatten()(tiled_layer)

Step 2: Concatenate the layers and create a model


In [0]:
merged_image_layers = keras.layers.concatenate([pixel_layer, tiled_layer])

merged_dense = Dense(16, activation='relu')(merged_image_layers)
merged_output = Dense(1)(merged_dense)

mixed_image_model = Model(inputs=image_input, outputs=merged_output)

In [70]:
mixed_image_model.summary()


Model: "model_6"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_13 (InputLayer)           [(None, 28, 28, 1)]  0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 26, 26, 16)   160         input_13[0][0]                   
__________________________________________________________________________________________________
input_12 (InputLayer)           [(None, 28, 28)]     0                                            
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 13, 13, 16)   0           conv2d[0][0]                     
__________________________________________________________________________________________________
flatten_5 (Flatten)             (None, 784)          0           input_12[0][0]                   
__________________________________________________________________________________________________
flatten_6 (Flatten)             (None, 2704)         0           max_pooling2d[0][0]              
__________________________________________________________________________________________________
concatenate_6 (Concatenate)     (None, 3488)         0           flatten_5[0][0]                  
                                                                 flatten_6[0][0]                  
__________________________________________________________________________________________________
dense_23 (Dense)                (None, 16)           55824       concatenate_6[0][0]              
__________________________________________________________________________________________________
dense_24 (Dense)                (None, 1)            17          dense_23[0][0]                   
==================================================================================================
Total params: 56,001
Trainable params: 56,001
Non-trainable params: 0
__________________________________________________________________________________________________

Combining images and metadata

This shows how we'd feed both images and associated metadata into a single model. To demonstrate this we'll be using the dummy dataset of tabular data below.

In [0]:
tabular_image_metadata = {
    'time': [9,10,2],
    'visibility': [0.2, 0.5, 0.1],
    'inclement_weather': [[0,0,1], [0,0,1], [1,0,0]],
    'location': [[0,1,0,0,0], [0,0,0,1,0], [1,0,0,0,0]] 
}

Step 1: Concatenate all tabular features


In [0]:
tabular_image_features = np.concatenate((
    np.expand_dims(tabular_image_metadata['time'], axis=1),
    np.expand_dims(tabular_image_metadata['visibility'], axis=1),    
    np.array(tabular_image_metadata['inclement_weather']),
    np.array(tabular_image_metadata['location'])
), axis=1)

In [82]:
# Preview the data
tabular_image_features


Out[82]:
array([[ 9. ,  0.2,  0. ,  0. ,  1. ,  0. ,  1. ,  0. ,  0. ,  0. ],
       [10. ,  0.5,  0. ,  0. ,  1. ,  0. ,  0. ,  0. ,  1. ,  0. ],
       [ 2. ,  0.1,  1. ,  0. ,  0. ,  1. ,  0. ,  0. ,  0. ,  0. ]])

Step 2: Define the tabular layer


In [0]:
image_tabular_input = Input(shape=(len(tabular_image_features[0]),))
image_tabular_layer = Dense(32, activation='relu')(image_tabular_input)

Step 3: Merge the tabular layer with the tiled layer we defined above


In [0]:
mixed_image_layers = keras.layers.concatenate([image_tabular_layer, tiled_layer])

merged_image_dense = Dense(16, activation='relu')(mixed_image_layers)
merged_image_output = Dense(1)(merged_image_dense)

mixed_image_tabular_model = Model(inputs=[image_tabular_input, image_input], outputs=merged_image_output)

In [86]:
mixed_image_tabular_model.summary()


Model: "model_7"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_13 (InputLayer)           [(None, 28, 28, 1)]  0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 26, 26, 16)   160         input_13[0][0]                   
__________________________________________________________________________________________________
input_14 (InputLayer)           [(None, 10)]         0                                            
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 13, 13, 16)   0           conv2d[0][0]                     
__________________________________________________________________________________________________
dense_25 (Dense)                (None, 32)           352         input_14[0][0]                   
__________________________________________________________________________________________________
flatten_6 (Flatten)             (None, 2704)         0           max_pooling2d[0][0]              
__________________________________________________________________________________________________
concatenate_8 (Concatenate)     (None, 2736)         0           dense_25[0][0]                   
                                                                 flatten_6[0][0]                  
__________________________________________________________________________________________________
dense_28 (Dense)                (None, 16)           43792       concatenate_8[0][0]              
__________________________________________________________________________________________________
dense_29 (Dense)                (None, 1)            17          dense_28[0][0]                   
==================================================================================================
Total params: 44,321
Trainable params: 44,321
Non-trainable params: 0
__________________________________________________________________________________________________
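Training the combined model works the same way as the earlier mixed-input examples: pass the tabular array and an image batch in the order given to Model(inputs=...). A self-contained sketch with randomly generated stand-in images and labels:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input, MaxPooling2D

# Dummy stand-ins: three 28x28 grayscale images plus 10 tabular features each
images = np.random.rand(3, 28, 28, 1).astype('float32')
tabular = np.random.rand(3, 10).astype('float32')
labels = np.array([0.0, 1.0, 0.0], dtype='float32')

# Same architecture as above: tiled (conv) image branch + dense tabular branch
image_input = Input(shape=(28, 28, 1))
tiled = Flatten()(MaxPooling2D()(Conv2D(16, 3, activation='relu')(image_input)))
tabular_input = Input(shape=(10,))
tabular_layer = Dense(32, activation='relu')(tabular_input)
merged = keras.layers.concatenate([tabular_layer, tiled])
output = Dense(1)(Dense(16, activation='relu')(merged))
toy_model = Model(inputs=[tabular_input, image_input], outputs=output)

toy_model.compile(optimizer='adam', loss='mse')
toy_model.fit(x=[tabular, images], y=labels, epochs=1, verbose=0)
```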

Copyright 2020 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License