This notebook demonstrates how to implement a simple linear image model on MNIST using the tf.keras API. It builds the foundation for this companion notebook, which explores tackling the same problem with other types of models such as DNN and CNN.
This notebook uses TF2.0 Please check your tensorflow version using the cell below. If it is not 2.0, please run the pip line below and restart the kernel.
In [ ]:
!sudo chown -R jupyter:jupyter /home/jupyter/training-data-analyst
In [ ]:
import os
import shutil
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard
from tensorflow.keras.layers import Dense, Flatten, Softmax
print(tf.__version__)
In [ ]:
!python3 -m pip freeze | grep 'tensorflow==2\|tensorflow-gpu==2' || \
python3 -m pip install tensorflow==2
In [ ]:
mnist = tf.keras.datasets.mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist
In [ ]:
HEIGHT, WIDTH = x_train[0].shape
NCLASSES = tf.size(tf.unique(y_train).y)
print("Image height x width is", HEIGHT, "x", WIDTH)
tf.print("There are", NCLASSES, "classes")
Each image is 28 x 28 pixels and represents a digit from 0 to 9. These images are black and white, so each pixel is a value from 0 (white) to 255 (black). Raw numbers can be hard to interpret sometimes, so we can plot the values to see the handwritten digit as an image.
In [ ]:
IMGNO = 12
# Uncomment to see raw numerical values.
# print(x_test[IMGNO])
plt.imshow(x_test[IMGNO].reshape(HEIGHT, WIDTH));
print("The label for image number", IMGNO, "is", y_test[IMGNO])
We can build our linear classifer using the tf.keras API, so we don't have to define or initialize our weights and biases. This happens automatically for us in the background. We can also add a softmax layer to transform the logits into probabilities. Finally, we can compile the model using categorical cross entropy in order to strongly penalize high probability predictions that were incorrect.
When building more complex models such as DNNs and CNNs our code will be more readable by using the tf.keras API. Let's get one working so we can test it and use it as a benchmark.
In [ ]:
def linear_model():
# TODO: Build a sequential model and compile it.
return model
In [ ]:
BUFFER_SIZE = 5000
BATCH_SIZE = 100
def scale(image, label):
# TODO
def load_dataset(training=True):
"""Loads MNIST dataset into a tf.data.Dataset"""
(x_train, y_train), (x_test, y_test) = mnist
x = x_train if training else x_test
y = y_train if training else y_test
# TODO: a) one-hot encode labels, apply `scale` function, and create dataset.
# One-hot encode the classes
if training:
# TODO
return dataset
In [ ]:
def create_shape_test(training):
dataset = load_dataset(training=training)
data_iter = dataset.__iter__()
(images, labels) = data_iter.get_next()
expected_image_shape = (BATCH_SIZE, HEIGHT, WIDTH)
expected_label_ndim = 2
assert(images.shape == expected_image_shape)
assert(labels.numpy().ndim == expected_label_ndim)
test_name = 'training' if training else 'eval'
print("Test for", test_name, "passed!")
create_shape_test(True)
create_shape_test(False)
Time to train the model! The original MNIST linear classifier had an error rate of 12%. Let's use that to sanity check that our model is learning.
In [ ]:
NUM_EPOCHS = 10
STEPS_PER_EPOCH = 100
model = linear_model()
train_data = load_dataset()
validation_data = load_dataset(training=False)
OUTDIR = "mnist_linear/"
checkpoint_callback = ModelCheckpoint(
OUTDIR, save_weights_only=True, verbose=1)
tensorboard_callback = TensorBoard(log_dir=OUTDIR)
history = model.fit(
# TODO: specify training/eval data, # epochs, steps per epoch.
verbose=2,
callbacks=[checkpoint_callback, tensorboard_callback]
)
In [ ]:
BENCHMARK_ERROR = .12
BENCHMARK_ACCURACY = 1 - BENCHMARK_ERROR
accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
assert(accuracy[-1] > BENCHMARK_ACCURACY)
assert(val_accuracy[-1] > BENCHMARK_ACCURACY)
print("Test to beat benchmark accuracy passed!")
assert(accuracy[0] < accuracy[1])
assert(accuracy[1] < accuracy[-1])
assert(val_accuracy[0] < val_accuracy[1])
assert(val_accuracy[1] < val_accuracy[-1])
print("Test model accuracy is improving passed!")
assert(loss[0] > loss[1])
assert(loss[1] > loss[-1])
assert(val_loss[0] > val_loss[1])
assert(val_loss[1] > val_loss[-1])
print("Test loss is decreasing passed!")
Were you able to get an accuracy of over 90%? Not bad for a linear estimator! Let's make some predictions and see if we can find where the model has trouble. Change the range of values below to find incorrect predictions, and plot the corresponding images. What would you have guessed for these images?
TODO 2: Change the range below to find an incorrect prediction
In [ ]:
image_numbers = range(0, 10, 1) # Change me, please.
def load_prediction_dataset():
dataset = (x_test[image_numbers], y_test[image_numbers])
dataset = tf.data.Dataset.from_tensor_slices(dataset)
dataset = dataset.map(scale).batch(len(image_numbers))
return dataset
predicted_results = model.predict(load_prediction_dataset())
for index, prediction in enumerate(predicted_results):
predicted_value = np.argmax(prediction)
actual_value = y_test[image_numbers[index]]
if actual_value != predicted_value:
print("image number: " + str(image_numbers[index]))
print("the prediction was " + str(predicted_value))
print("the actual label is " + str(actual_value))
print("")
In [ ]:
bad_image_number = 8
plt.imshow(x_test[bad_image_number].reshape(HEIGHT, WIDTH));
It's understandable why the poor computer would have some trouble. Some of these images are difficult for even humans to read. In fact, we can see what the computer thinks each digit looks like.
Each of the 10 neurons in the dense layer of our model has 785 weights feeding into it. That's 1 weight for every pixel in the image + 1 for a bias term. These weights are flattened feeding into the model, but we can reshape them back into the original image dimensions to see what the computer sees.
TODO 3: Reshape the layer weights to be the shape of an input image and plot.
In [ ]:
DIGIT = 0 # Change me to be an integer from 0 to 9.
LAYER = 1 # Layer 0 flattens image, so no weights
WEIGHT_TYPE = 0 # 0 for variable weights, 1 for biases
dense_layer_weights = model.layers[LAYER].get_weights()
digit_weights = dense_layer_weights[WEIGHT_TYPE][:, DIGIT]
plt.imshow(digit_weights.reshape((HEIGHT, WIDTH)))
Did you recognize the digit the computer was trying to learn? Pretty trippy, isn't it! Even with a simple "brain", the computer can form an idea of what a digit should be. The human brain, however, uses layers and layers of calculations for image recognition. Ready for the next challenge? Click here to super charge our models with human-like vision.
Want to push your understanding further? Instead of using Keras' built in layers, try repeating the above exercise with your own custom layers.
Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.