Overview

This codelab demonstrates how to build an LSTM model for MNIST recognition using Keras, and how to convert the model to TensorFlow Lite.



In [0]:
!pip install tf-nightly

Prerequisites

We're going to override the environment variable TF_ENABLE_CONTROL_FLOW_V2, since TensorFlow Lite's control flow support requires control flow v2.


In [0]:
# This is important!
import os
os.environ['TF_ENABLE_CONTROL_FLOW_V2'] = '1'

import tensorflow as tf
import numpy as np

Step 1: Build the MNIST LSTM model.

Note that we will be using tf.lite.experimental.nn.TFLiteLSTMCell & tf.lite.experimental.nn.dynamic_rnn in this tutorial.

Also note that we're not trying to build a model for a real-world application; this only demonstrates how to use TensorFlow Lite. You could build a much better model using a CNN.

For a more canonical LSTM codelab, please see here.


In [0]:
# Step 1: Build the MNIST LSTM model.
def buildLstmLayer(inputs, num_layers, num_units):
  """Build the lstm layer.

  Args:
    inputs: The input data.
    num_layers: How many LSTM layers do we want.
    num_units: The unmber of hidden units in the LSTM cell.
  """
  lstm_cells = []
  for i in range(num_layers):
    lstm_cells.append(
        tf.lite.experimental.nn.TFLiteLSTMCell(
            num_units, forget_bias=0, name='rnn{}'.format(i)))
  lstm_layers = tf.keras.layers.StackedRNNCells(lstm_cells)
  # The input is sized as [batch, time, input_size], so transpose it to the
  # time-major layout [time, batch, input_size] expected by dynamic_rnn below.
  transposed_inputs = tf.transpose(inputs, perm=[1, 0, 2])
  outputs, _ = tf.lite.experimental.nn.dynamic_rnn(
      lstm_layers,
      transposed_inputs,
      dtype='float32',
      time_major=True)
  # Unstack along the time axis and keep only the last time step's output.
  unstacked_outputs = tf.unstack(outputs, axis=0)
  return unstacked_outputs[-1]

tf.reset_default_graph()
model = tf.keras.models.Sequential([
  tf.keras.layers.Input(shape=(28, 28), name='input'),
  tf.keras.layers.Lambda(buildLstmLayer, arguments={'num_layers' : 2, 'num_units' : 64}),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax, name='output')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
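
Before training, a quick sanity check (an optional extra, not part of the original flow; the random batch below is just a placeholder) confirms the model maps a [batch, 28, 28] input to a [batch, 10] output:

In [0]:
# Optional sanity check with a random batch: the untrained model should
# produce one probability per digit class.
dummy_batch = np.random.rand(1, 28, 28).astype(np.float32)
print(model.predict(dummy_batch).shape)  # Expected: (1, 10)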

Step 2: Train & Evaluate the model.

We will train the model using MNIST data.


In [0]:
# Step 2: Train & Evaluate the model.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Cast x_train & x_test to float32.
x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
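
model.evaluate returns the test loss followed by the metrics listed in compile, so if you want the numbers programmatically you can capture them (an optional extra step, not in the original codelab):

In [0]:
# Optional: capture the evaluation results instead of only printing them.
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test loss: {:.4f}, test accuracy: {:.4f}'.format(test_loss, test_acc))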

Step 3: Convert the Keras model to TensorFlow Lite model.

Note that here we:

  1. Freeze the graph,
  2. Use tf.lite.experimental.convert_op_hints_to_stubs to convert the OpHinted nodes,
  3. Remove the training nodes, and
  4. Convert the graph to a TensorFlow Lite model.

In [0]:
# Step 3: Convert the Keras model to TensorFlow Lite model.
sess = tf.keras.backend.get_session()
frozen_graph = tf.graph_util.convert_variables_to_constants(
  sess, sess.graph_def, ['output/Softmax'])
converted_graph = tf.lite.experimental.convert_op_hints_to_stubs(graph_def=frozen_graph)
converted_graph = tf.graph_util.remove_training_nodes(converted_graph)
input_tensor = sess.graph.get_tensor_by_name('input:0')
output_tensor = sess.graph.get_tensor_by_name('output/Softmax:0')
converter = tf.lite.TFLiteConverter(converted_graph, [input_tensor], [output_tensor])
tflite = converter.convert()
print('Model converted successfully!')
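
If you want to keep the converted model around, e.g. to deploy it on a device, you can write the flatbuffer to disk. This is an optional step, and the filename mnist_lstm.tflite is just an example:

In [0]:
# Optional: persist the converted model. The filename is an example, not
# prescribed by this codelab.
with open('mnist_lstm.tflite', 'wb') as f:
  f.write(tflite)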

Step 4: Check the converted TensorFlow Lite model.

We're going to load the TensorFlow Lite model and use the TensorFlow Lite Python interpreter to verify the results.


In [0]:
# Step 4: Check the converted TensorFlow Lite model.
interpreter = tf.lite.Interpreter(model_content=tflite)

try:
  interpreter.allocate_tensors()
except ValueError:
  assert False, 'Failed to allocate tensors for the TFLite interpreter.'

MINI_BATCH_SIZE = 1
correct_case = 0
# Look up the input/output tensor indices once, outside the loop.
input_index = interpreter.get_input_details()[0]['index']
output_index = interpreter.get_output_details()[0]['index']
for i in range(len(x_test)):
  interpreter.set_tensor(
      input_index, x_test[i * MINI_BATCH_SIZE:(i + 1) * MINI_BATCH_SIZE])
  interpreter.invoke()
  result = interpreter.get_tensor(output_index)
  # Reset all variables so it will not pollute other inferences.
  interpreter.reset_all_variables()
  # Evaluate.
  prediction = np.argmax(result)
  if prediction == y_test[i]:
    correct_case += 1

print('TensorFlow Lite model accuracy: {}'.format(correct_case * 1.0 / len(x_test)))
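
As an optional cross-check (not part of the original codelab), you can compare a single TensorFlow Lite prediction against the Keras model that is still in the session; the two should agree on most test images:

In [0]:
# Optional cross-check: the Keras model and the TFLite interpreter should
# predict the same digit for the first test image.
keras_prediction = np.argmax(model.predict(x_test[:1]))
interpreter.set_tensor(input_index, x_test[:1])
interpreter.invoke()
tflite_prediction = np.argmax(interpreter.get_tensor(output_index))
interpreter.reset_all_variables()
print('Keras: {}, TFLite: {}'.format(keras_prediction, tflite_prediction))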