In [0]:
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Getting started: Training and prediction with Keras in AI Platform


Overview

This tutorial shows how to train a neural network on AI Platform using the Keras sequential API and how to serve predictions from that model.

Keras is a high-level API for building and training deep learning models. tf.keras is TensorFlow’s implementation of this API.
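To give a concrete sense of the API, a minimal, purely illustrative binary classifier in tf.keras might look like the following. This is not the model trained in this tutorial, and the feature count is a placeholder:

import tensorflow as tf

# Illustrative sketch only -- not the tutorial's actual model.
# The input_shape of 10 is a placeholder feature count.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # P(income > $50,000)
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])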

The first two parts of the tutorial walk through training a model on Cloud AI Platform using prewritten Keras code, deploying the trained model to AI Platform, and serving online predictions from the deployed model.

The last part of the tutorial digs into the training code used for this model and explains how to ensure it's compatible with AI Platform. To learn more about building machine learning models in Keras more generally, read TensorFlow's Keras tutorials.

Dataset

This tutorial uses the United States Census Income Dataset provided by the UC Irvine Machine Learning Repository. This dataset contains information about people from a 1994 Census database, including age, education, marital status, occupation, and whether they make more than $50,000 a year.

Objective

The goal is to train a deep neural network (DNN) using Keras that predicts whether a person makes more than $50,000 a year (target label) based on other Census information about the person (features).

This tutorial focuses more on using this model with AI Platform than on the design of the model itself. However, it's always important to think about potential problems and unintended consequences when building machine learning systems. See the Machine Learning Crash Course exercise about fairness to learn about sources of bias in the Census dataset, as well as machine learning fairness more generally.

Costs

This tutorial uses billable components of Google Cloud Platform (GCP):

  • AI Platform
  • Cloud Storage

Learn about AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

Before you begin

You must do several things before you can train and deploy a model in AI Platform:

  • Set up your local development environment.
  • Set up a GCP project with billing and the necessary APIs enabled.
  • Authenticate your GCP account in this notebook.
  • Create a Cloud Storage bucket to store your training package and your trained model.

Set up your local development environment

If you are using Colab or AI Platform Notebooks, your environment already meets all the requirements to run this notebook. You can skip this step.

Otherwise, make sure your environment meets this notebook's requirements. You need the following:

  • The Google Cloud SDK
  • Git
  • Python 3
  • virtualenv
  • Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to Setting up a Python development environment and the Jupyter installation guide provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:

  1. Install and initialize the Cloud SDK.

  2. Install Python 3.

  3. Install virtualenv and create a virtual environment that uses Python 3.

  4. Activate that environment and run pip install jupyter in a shell to install Jupyter.

  5. Run jupyter notebook in a shell to launch Jupyter.

  6. Open this notebook in the Jupyter Notebook Dashboard.

Set up your GCP project

The following steps are required, regardless of your notebook environment.

  1. Select or create a GCP project.

  2. Make sure that billing is enabled for your project.

  3. Enable the AI Platform ("Cloud Machine Learning Engine") and Compute Engine APIs.

  4. Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.

Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.


In [0]:
PROJECT_ID = "<your-project-id>" #@param {type:"string"}
! gcloud config set project $PROJECT_ID


Updated property [core/project].

Authenticate your GCP account

If you are using AI Platform Notebooks, your environment is already authenticated. Skip this step.

If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

Otherwise, follow these steps:

  1. In the GCP Console, go to the Create service account key page.

  2. From the Service account drop-down list, select New service account.

  3. In the Service account name field, enter a name.

  4. From the Role drop-down list, select Machine Learning Engine > AI Platform Admin and Storage > Storage Object Admin.

  5. Click Create. A JSON file that contains your key downloads to your local environment.

  6. Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.


In [0]:
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

if 'google.colab' in sys.modules:
  from google.colab import auth as google_auth
  google_auth.authenticate_user()

# If you are running this notebook locally, replace the string below with the
# path to your service account key and run this cell to authenticate your GCP
# account.
else:
  %env GOOGLE_APPLICATION_CREDENTIALS ''

Create a Cloud Storage bucket

The following steps are required, regardless of your notebook environment.

When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. AI Platform runs the code from this package. In this tutorial, AI Platform also saves the trained model that results from your job in the same bucket. You can then create an AI Platform model version based on this output in order to serve online predictions.
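For reference, the model-version step mentioned above is covered in a later part of the full tutorial. A rough sketch of the commands involved looks like the following, where the model name, version name, and SavedModel path are placeholders to replace with your own values:

! gcloud ai-platform models create my_model --regions $REGION
! gcloud ai-platform versions create v1 \
  --model my_model \
  --origin gs://$BUCKET_NAME/keras-job-dir/keras_export/ \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --framework tensorflow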

Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.

You may also change the REGION variable, which is used for operations throughout the rest of this notebook. Make sure to choose a region where Cloud AI Platform services are available.


In [0]:
BUCKET_NAME = "<your-bucket-name>" #@param {type:"string"}
REGION = "us-central1" #@param {type:"string"}

Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.


In [0]:
! gsutil mb -l $REGION gs://$BUCKET_NAME


Creating gs://<your-bucket-name>/...

Finally, validate access to your Cloud Storage bucket by examining its contents:


In [0]:
! gsutil ls -al gs://$BUCKET_NAME

Part 1. Quickstart for training in AI Platform

This section of the tutorial walks you through submitting a training job to Cloud AI Platform. This job runs sample code that uses Keras to train a deep neural network on the United States Census data. It outputs the trained model as a TensorFlow SavedModel directory in your Cloud Storage bucket.

Get training code and dependencies

First, download the training code and change the notebook's working directory:


In [0]:
# Clone the repository of AI Platform samples
! git clone --depth 1 https://github.com/GoogleCloudPlatform/cloudml-samples

# Set the working directory to the sample code directory
%cd cloudml-samples/census/tf-keras


Cloning into 'cloudml-samples'...
remote: Enumerating objects: 404, done.
remote: Counting objects: 100% (404/404), done.
remote: Compressing objects: 100% (333/333), done.
remote: Total 404 (delta 110), reused 199 (delta 46), pack-reused 0
Receiving objects: 100% (404/404), 22.67 MiB | 19.76 MiB/s, done.
Resolving deltas: 100% (110/110), done.
/content/cloudml-samples/census/tf-keras

Notice that the training code is structured as a Python package in the trainer/ subdirectory:


In [0]:
# `ls` shows the working directory's contents. The `p` flag adds trailing 
# slashes to subdirectory names. The `R` flag lists subdirectories recursively.
! ls -pR


.:
README.md  requirements.txt  trainer/

./trainer:
__init__.py  model.py  task.py	util.py

Run the following cell to install Python dependencies needed to train the model locally. When you run the training job in AI Platform, dependencies are preinstalled based on the runtime version you choose.


In [0]:
! pip install -r requirements.txt


Requirement already satisfied: numpy>=1.14 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 1)) (1.17.4)
Requirement already satisfied: pandas>=0.22 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 2)) (0.25.3)
Requirement already satisfied: six>=1.11 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 3)) (1.13.0)
Requirement already satisfied: tensorflow<2,>=1.15 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 4)) (1.15.0)
Requirement already satisfied: python-dateutil>=2.6.1 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.22->-r requirements.txt (line 2)) (2.8.1)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.22->-r requirements.txt (line 2)) (2019.3)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.1.0)
Requirement already satisfied: gast==0.2.2 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (0.2.2)
Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (0.8.1)
Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (3.11.1)
Requirement already satisfied: tensorflow-estimator==1.15.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.15.1)
Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.11.2)
Requirement already satisfied: wheel>=0.26 in /usr/lib/python3/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (0.29.0)
Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (3.1.0)
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.25.0)
Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (0.8.1)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.1.0)
Requirement already satisfied: keras-applications>=1.0.8 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.0.8)
Requirement already satisfied: tensorboard<1.16.0,>=1.15.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (1.15.0)
Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.7/dist-packages (from tensorflow<2,>=1.15->-r requirements.txt (line 4)) (0.1.8)
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/dist-packages (from protobuf>=3.6.1->tensorflow<2,>=1.15->-r requirements.txt (line 4)) (42.0.2)
Requirement already satisfied: h5py in /usr/local/lib/python3.7/dist-packages (from keras-applications>=1.0.8->tensorflow<2,>=1.15->-r requirements.txt (line 4)) (2.10.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard<1.16.0,>=1.15.0->tensorflow<2,>=1.15->-r requirements.txt (line 4)) (3.1.1)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.7/dist-packages (from tensorboard<1.16.0,>=1.15.0->tensorflow<2,>=1.15->-r requirements.txt (line 4)) (0.16.0)

Train your model locally

Before training on AI Platform, run the training job locally to verify that the file structure and packaging are correct.

For a complex or resource-intensive job, you may want to train locally on a small sample of your dataset to verify your code. Then you can run the job on AI Platform to train on the whole dataset.

This sample runs a relatively quick job on a small dataset, so the local training and the AI Platform job run the same code on the same data.

Run the following cell to train a model locally:


In [0]:
# Explicitly tell `gcloud ai-platform local train` to use Python 3 
! gcloud config set ml_engine/local_python $(which python3)

# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the AI Platform environment, especially for
# distributed training (not applicable here).
! gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir local-training-output


Updated property [ml_engine/local_python].
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-03-27 17:53:24.297156: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200000000 Hz
2019-03-27 17:53:24.297428: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x2441b80 executing computations on platform Host. Devices:
2019-03-27 17:53:24.297464: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.

Epoch 00001: LearningRateScheduler reducing learning rate to 0.02.
Epoch 1/20
254/254 [==============================] - 1s 5ms/step - loss: 0.5032 - acc: 0.7890 - val_loss: 0.4553 - val_acc: 0.8030

Epoch 00002: LearningRateScheduler reducing learning rate to 0.015.
Epoch 2/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3616 - acc: 0.8362 - val_loss: 0.3273 - val_acc: 0.8468

Epoch 00003: LearningRateScheduler reducing learning rate to 0.0125.
Epoch 3/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3411 - acc: 0.8450 - val_loss: 0.3294 - val_acc: 0.8447

Epoch 00004: LearningRateScheduler reducing learning rate to 0.01125.
Epoch 4/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3362 - acc: 0.8454 - val_loss: 0.3566 - val_acc: 0.8410

Epoch 00005: LearningRateScheduler reducing learning rate to 0.010625.
Epoch 5/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3325 - acc: 0.8469 - val_loss: 0.3264 - val_acc: 0.8507

Epoch 00006: LearningRateScheduler reducing learning rate to 0.0103125.
Epoch 6/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3316 - acc: 0.8471 - val_loss: 0.3228 - val_acc: 0.8515

Epoch 00007: LearningRateScheduler reducing learning rate to 0.01015625.
Epoch 7/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3297 - acc: 0.8475 - val_loss: 0.3308 - val_acc: 0.8452

Epoch 00008: LearningRateScheduler reducing learning rate to 0.010078125.
Epoch 8/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3287 - acc: 0.8479 - val_loss: 0.3449 - val_acc: 0.8394

Epoch 00009: LearningRateScheduler reducing learning rate to 0.0100390625.
Epoch 9/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3274 - acc: 0.8488 - val_loss: 0.3213 - val_acc: 0.8522

Epoch 00010: LearningRateScheduler reducing learning rate to 0.01001953125.
Epoch 10/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3256 - acc: 0.8488 - val_loss: 0.3302 - val_acc: 0.8508

Epoch 00011: LearningRateScheduler reducing learning rate to 0.010009765625.
Epoch 11/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3280 - acc: 0.8502 - val_loss: 0.3342 - val_acc: 0.8443

Epoch 00012: LearningRateScheduler reducing learning rate to 0.010004882812500001.
Epoch 12/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3283 - acc: 0.8502 - val_loss: 0.3511 - val_acc: 0.8506

Epoch 00013: LearningRateScheduler reducing learning rate to 0.01000244140625.
Epoch 13/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3258 - acc: 0.8498 - val_loss: 0.3450 - val_acc: 0.8491

Epoch 00014: LearningRateScheduler reducing learning rate to 0.010001220703125.
Epoch 14/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3272 - acc: 0.8500 - val_loss: 0.3220 - val_acc: 0.8524

Epoch 00015: LearningRateScheduler reducing learning rate to 0.0100006103515625.
Epoch 15/20
254/254 [==============================] - 1s 3ms/step - loss: 0.3261 - acc: 0.8498 - val_loss: 0.3221 - val_acc: 0.8520

Epoch 00016: LearningRateScheduler reducing learning rate to 0.01000030517578125.
Epoch 16/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3239 - acc: 0.8509 - val_loss: 0.3232 - val_acc: 0.8480

Epoch 00017: LearningRateScheduler reducing learning rate to 0.010000152587890625.
Epoch 17/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3275 - acc: 0.8484 - val_loss: 0.3377 - val_acc: 0.8540

Epoch 00018: LearningRateScheduler reducing learning rate to 0.010000076293945313.
Epoch 18/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3242 - acc: 0.8490 - val_loss: 0.3275 - val_acc: 0.8522

Epoch 00019: LearningRateScheduler reducing learning rate to 0.010000038146972657.
Epoch 19/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3236 - acc: 0.8507 - val_loss: 0.3302 - val_acc: 0.8471

Epoch 00020: LearningRateScheduler reducing learning rate to 0.010000019073486329.
Epoch 20/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3259 - acc: 0.8490 - val_loss: 0.3218 - val_acc: 0.8518
WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fa74da22278>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/network.py:1436: update_checkpoint_state (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.train.CheckpointManager to manage checkpoints rather than manually editing the Checkpoint proto.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: local-training-output/keras_export/1553709223/saved_model.pb
Model exported to:  local-training-output/keras_export/1553709223
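Optionally, you can inspect the exported SavedModel's inputs and serving signature with the saved_model_cli tool that ships with TensorFlow. The timestamped directory name below is a placeholder; use the export path printed by your own run:

! saved_model_cli show --dir local-training-output/keras_export/<timestamp> --all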

Train your model using AI Platform

Next, submit a training job to AI Platform. This runs the training module in the cloud and exports the trained model to Cloud Storage.

First, give your training job a name and choose a directory within your Cloud Storage bucket for saving intermediate and output files:


In [0]:
JOB_NAME = 'my_first_keras_job'
JOB_DIR = 'gs://' + BUCKET_NAME + '/keras-job-dir'

Run the following command to package the trainer/ directory, upload it to the specified --job-dir, and instruct AI Platform to run the trainer.task module from that package.

The --stream-logs flag lets you view training logs in the cell below. You can also see logs and other job details in the GCP Console.

Hyperparameter tuning

You can optionally perform hyperparameter tuning by using the included hptuning_config.yaml configuration file. This file tells AI Platform to tune the batch size and learning rate over multiple training trials in order to maximize accuracy.

In this example, the training code uses a TensorBoard callback, which creates TensorFlow Summary Events during training. AI Platform uses these events to track the metric you want to optimize. Learn more about hyperparameter tuning in AI Platform Training.
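As a point of reference, attaching such a callback in Keras looks roughly like the sketch below. This is not the tutorial's exact training code, and the log directory shown is an assumed location under the job directory:

import tensorflow as tf

# Sketch only: write TensorFlow summary events (readable by TensorBoard and by
# AI Platform hyperparameter tuning) under an assumed logs/ path.
tensorboard_cb = tf.keras.callbacks.TensorBoard(
    log_dir='gs://<your-bucket-name>/keras-job-dir/logs')
# model.fit(train_x, train_y, epochs=20, callbacks=[tensorboard_cb])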


In [0]:
! gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.7 \
  --runtime-version 1.15 \
  --job-dir $JOB_DIR \
  --stream-logs


Job [my_first_keras_job] submitted successfully.
INFO	2019-03-27 17:54:27 +0000	service		Validating job requirements...
INFO	2019-03-27 17:54:27 +0000	service		Job creation request has been successfully validated.
INFO	2019-03-27 17:54:27 +0000	service		Job my_first_keras_job is queued.
INFO	2019-03-27 17:54:27 +0000	service		Waiting for job to be provisioned.
INFO	2019-03-27 17:54:30 +0000	service		Waiting for training program to start.
INFO	2019-03-27 17:56:09 +0000	master-replica-0		Running task with arguments: --cluster={"master": ["127.0.0.1:2222"]} --task={"type": "master", "index": 0} --job={  "package_uris": ["gs://<your-bucket-name>/keras-job-dir/packages/dcc159f40836cff74a27866227b327b0a8ccb5266194e76cff5368266b6d1cdd/trainer-0.0.0.tar.gz"],  "python_module": "trainer.task",  "region": "us-central1",  "runtime_version": "1.15",  "job_dir": "gs://<your-bucket-name>/keras-job-dir",  "run_on_raw_vm": true,  "python_version": "3.7"}
WARNING	2019-03-27 17:56:09 +0000	master-replica-0		From /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
WARNING	2019-03-27 17:56:09 +0000	master-replica-0		Instructions for updating:
WARNING	2019-03-27 17:56:09 +0000	master-replica-0		Colocations handled automatically by placer.
INFO	2019-03-27 17:56:18 +0000	master-replica-0		Running module trainer.task.
INFO	2019-03-27 17:56:18 +0000	master-replica-0		Downloading the package: gs://<your-bucket-name>/keras-job-dir/packages/dcc159f40836cff74a27866227b327b0a8ccb5266194e76cff5368266b6d1cdd/trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:18 +0000	master-replica-0		Running command: gsutil -q cp gs://<your-bucket-name>/keras-job-dir/packages/dcc159f40836cff74a27866227b327b0a8ccb5266194e76cff5368266b6d1cdd/trainer-0.0.0.tar.gz trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:20 +0000	master-replica-0		Installing the package: gs://<your-bucket-name>/keras-job-dir/packages/dcc159f40836cff74a27866227b327b0a8ccb5266194e76cff5368266b6d1cdd/trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:20 +0000	master-replica-0		Running command: pip3 install --user --upgrade --force-reinstall --no-deps trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:22 +0000	master-replica-0		Processing ./trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:22 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
INFO	2019-03-27 17:56:22 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Building wheels for collected packages: trainer
INFO	2019-03-27 17:56:23 +0000	master-replica-0		  Building wheel for trainer (setup.py): started
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
INFO	2019-03-27 17:56:23 +0000	master-replica-0		  Building wheel for trainer (setup.py): finished with status 'done'
INFO	2019-03-27 17:56:23 +0000	master-replica-0		  Stored in directory: /root/.cache/pip/wheels/0d/1b/db/f8e86b296734f0b137e17e5d34862f4ae4faf8388755c6272f
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Successfully built trainer
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Installing collected packages: trainer
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Successfully installed trainer-0.0.0
INFO	2019-03-27 17:56:23 +0000	master-replica-0		Running command: pip3 install --user trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:24 +0000	master-replica-0		Processing ./trainer-0.0.0.tar.gz
INFO	2019-03-27 17:56:24 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
INFO	2019-03-27 17:56:24 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
INFO	2019-03-27 17:56:24 +0000	master-replica-0		Building wheels for collected packages: trainer
INFO	2019-03-27 17:56:24 +0000	master-replica-0		  Building wheel for trainer (setup.py): started
INFO	2019-03-27 17:56:25 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
INFO	2019-03-27 17:56:25 +0000	master-replica-0		Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
INFO	2019-03-27 17:56:25 +0000	master-replica-0		  Building wheel for trainer (setup.py): finished with status 'done'
INFO	2019-03-27 17:56:25 +0000	master-replica-0		  Stored in directory: /root/.cache/pip/wheels/0d/1b/db/f8e86b296734f0b137e17e5d34862f4ae4faf8388755c6272f
INFO	2019-03-27 17:56:25 +0000	master-replica-0		Successfully built trainer
INFO	2019-03-27 17:56:29 +0000	master-replica-0		Installing collected packages: trainer
INFO	2019-03-27 17:56:29 +0000	master-replica-0		  Found existing installation: trainer 0.0.0
INFO	2019-03-27 17:56:29 +0000	master-replica-0		    Uninstalling trainer-0.0.0:
INFO	2019-03-27 17:56:29 +0000	master-replica-0		      Successfully uninstalled trainer-0.0.0
INFO	2019-03-27 17:56:29 +0000	master-replica-0		Successfully installed trainer-0.0.0
INFO	2019-03-27 17:56:29 +0000	master-replica-0		Running command: python3 -m trainer.task --job-dir gs://<your-bucket-name>/keras-job-dir
WARNING	2019-03-27 17:56:43 +0000	master-replica-0		From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
WARNING	2019-03-27 17:56:43 +0000	master-replica-0		Instructions for updating:
WARNING	2019-03-27 17:56:43 +0000	master-replica-0		Colocations handled automatically by placer.
INFO	2019-03-27 17:56:44 +0000	master-replica-0		Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO	2019-03-27 17:56:44 +0000	master-replica-0		CPU Frequency: 2300000000 Hz
INFO	2019-03-27 17:56:44 +0000	master-replica-0		XLA service 0x4f15c40 executing computations on platform Host. Devices:
INFO	2019-03-27 17:56:44 +0000	master-replica-0		  StreamExecutor device (0): <undefined>, <undefined>
WARNING	2019-03-27 17:56:44 +0000	master-replica-0		From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
WARNING	2019-03-27 17:56:44 +0000	master-replica-0		Instructions for updating:
WARNING	2019-03-27 17:56:44 +0000	master-replica-0		Use tf.cast instead.
INFO	2019-03-27 17:56:45 +0000	master-replica-0		Epoch 00001: LearningRateScheduler reducing learning rate to 0.02.
INFO	2019-03-27 17:56:45 +0000	master-replica-0		Epoch 1/20
INFO	2019-03-27 17:56:45 +0000	master-replica-0		  1/254 [..............................] - ETA: 1:06 - loss: 0.5855 - acc: 0.789
INFO	2019-03-27 17:56:45 +0000	master-replica-0		 16/254 [>.............................] - ETA: 4s - loss: 3.8615 - acc: 0.7495 
INFO	2019-03-27 17:56:45 +0000	master-replica-0		 34/254 [===>..........................] - ETA: 2s - loss: 3.1560 - acc: 0.741
INFO	2019-03-27 17:56:45 +0000	master-replica-0		 52/254 [=====>........................] - ETA: 1s - loss: 2.2601 - acc: 0.751
INFO	2019-03-27 17:56:45 +0000	master-replica-0		 69/254 [=======>......................] - ETA: 1s - loss: 1.8414 - acc: 0.750
INFO	2019-03-27 17:56:45 +0000	master-replica-0		 87/254 [=========>....................] - ETA: 0s - loss: 1.5580 - acc: 0.754
INFO	2019-03-27 17:56:45 +0000	master-replica-0		105/254 [===========>..................] - ETA: 0s - loss: 1.3674 - acc: 0.761
INFO	2019-03-27 17:56:45 +0000	master-replica-0		121/254 [=============>................] - ETA: 0s - loss: 1.2418 - acc: 0.767
INFO	2019-03-27 17:56:45 +0000	master-replica-0		141/254 [===============>..............] - ETA: 0s - loss: 1.1292 - acc: 0.770
INFO	2019-03-27 17:56:45 +0000	master-replica-0		161/254 [==================>...........] - ETA: 0s - loss: 1.0420 - acc: 0.774
INFO	2019-03-27 17:56:45 +0000	master-replica-0		179/254 [====================>.........] - ETA: 0s - loss: 0.9786 - acc: 0.778
INFO	2019-03-27 17:56:45 +0000	master-replica-0		198/254 [======================>.......] - ETA: 0s - loss: 0.9222 - acc: 0.783
INFO	2019-03-27 17:56:45 +0000	master-replica-0		218/254 [========================>.....] - ETA: 0s - loss: 0.8751 - acc: 0.786
INFO	2019-03-27 17:56:45 +0000	master-replica-0		238/254 [===========================>..] - ETA: 0s - loss: 0.8347 - acc: 0.788
INFO	2019-03-27 17:56:45 +0000	master-replica-0		254/254 [==============================] - 1s 4ms/step - loss: 0.8076 - acc: 0.7896 - val_loss: 0.4046 - val_acc: 0.8322
INFO	2019-03-27 17:56:45 +0000	master-replica-0		Epoch 00002: LearningRateScheduler reducing learning rate to 0.015.
INFO	2019-03-27 17:56:45 +0000	master-replica-0		Epoch 2/20
INFO	2019-03-27 17:56:45 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3897 - acc: 0.843
INFO	2019-03-27 17:56:45 +0000	master-replica-0		 21/254 [=>............................] - ETA: 0s - loss: 0.3849 - acc: 0.830
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 40/254 [===>..........................] - ETA: 0s - loss: 0.3795 - acc: 0.827
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 61/254 [======>.......................] - ETA: 0s - loss: 0.3706 - acc: 0.826
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 80/254 [========>.....................] - ETA: 0s - loss: 0.3666 - acc: 0.826
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 99/254 [==========>...................] - ETA: 0s - loss: 0.3618 - acc: 0.829
INFO	2019-03-27 17:56:46 +0000	master-replica-0		119/254 [=============>................] - ETA: 0s - loss: 0.3604 - acc: 0.830
INFO	2019-03-27 17:56:46 +0000	master-replica-0		139/254 [===============>..............] - ETA: 0s - loss: 0.3622 - acc: 0.829
INFO	2019-03-27 17:56:46 +0000	master-replica-0		158/254 [=================>............] - ETA: 0s - loss: 0.3593 - acc: 0.831
INFO	2019-03-27 17:56:46 +0000	master-replica-0		177/254 [===================>..........] - ETA: 0s - loss: 0.3601 - acc: 0.831
INFO	2019-03-27 17:56:46 +0000	master-replica-0		197/254 [======================>.......] - ETA: 0s - loss: 0.3598 - acc: 0.830
INFO	2019-03-27 17:56:46 +0000	master-replica-0		216/254 [========================>.....] - ETA: 0s - loss: 0.3599 - acc: 0.831
INFO	2019-03-27 17:56:46 +0000	master-replica-0		237/254 [==========================>...] - ETA: 0s - loss: 0.3597 - acc: 0.831
INFO	2019-03-27 17:56:46 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3580 - acc: 0.8321 - val_loss: 0.3400 - val_acc: 0.8372
INFO	2019-03-27 17:56:46 +0000	master-replica-0		Epoch 00003: LearningRateScheduler reducing learning rate to 0.0125.
INFO	2019-03-27 17:56:46 +0000	master-replica-0		Epoch 3/20
INFO	2019-03-27 17:56:46 +0000	master-replica-0		  1/254 [..............................] - ETA: 1s - loss: 0.3455 - acc: 0.820
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 17/254 [=>............................] - ETA: 0s - loss: 0.3449 - acc: 0.841
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 36/254 [===>..........................] - ETA: 0s - loss: 0.3414 - acc: 0.841
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 55/254 [=====>........................] - ETA: 0s - loss: 0.3324 - acc: 0.844
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 74/254 [=======>......................] - ETA: 0s - loss: 0.3378 - acc: 0.841
INFO	2019-03-27 17:56:46 +0000	master-replica-0		 91/254 [=========>....................] - ETA: 0s - loss: 0.3372 - acc: 0.842
INFO	2019-03-27 17:56:46 +0000	master-replica-0		107/254 [===========>..................] - ETA: 0s - loss: 0.3394 - acc: 0.842
INFO	2019-03-27 17:56:47 +0000	master-replica-0		122/254 [=============>................] - ETA: 0s - loss: 0.3433 - acc: 0.839
INFO	2019-03-27 17:56:47 +0000	master-replica-0		138/254 [===============>..............] - ETA: 0s - loss: 0.3426 - acc: 0.839
INFO	2019-03-27 17:56:47 +0000	master-replica-0		158/254 [=================>............] - ETA: 0s - loss: 0.3421 - acc: 0.841
INFO	2019-03-27 17:56:47 +0000	master-replica-0		175/254 [===================>..........] - ETA: 0s - loss: 0.3427 - acc: 0.841
INFO	2019-03-27 17:56:47 +0000	master-replica-0		192/254 [=====================>........] - ETA: 0s - loss: 0.3417 - acc: 0.841
INFO	2019-03-27 17:56:47 +0000	master-replica-0		209/254 [=======================>......] - ETA: 0s - loss: 0.3405 - acc: 0.842
INFO	2019-03-27 17:56:47 +0000	master-replica-0		226/254 [=========================>....] - ETA: 0s - loss: 0.3409 - acc: 0.842
INFO	2019-03-27 17:56:47 +0000	master-replica-0		247/254 [============================>.] - ETA: 0s - loss: 0.3412 - acc: 0.842
INFO	2019-03-27 17:56:47 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3409 - acc: 0.8425 - val_loss: 0.3308 - val_acc: 0.8496
INFO	2019-03-27 17:56:47 +0000	master-replica-0		Epoch 00004: LearningRateScheduler reducing learning rate to 0.01125.
INFO	2019-03-27 17:56:47 +0000	master-replica-0		Epoch 4/20
INFO	2019-03-27 17:56:47 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.4057 - acc: 0.820
INFO	2019-03-27 17:56:47 +0000	master-replica-0		 17/254 [=>............................] - ETA: 0s - loss: 0.3522 - acc: 0.832
INFO	2019-03-27 17:56:47 +0000	master-replica-0		 34/254 [===>..........................] - ETA: 0s - loss: 0.3467 - acc: 0.844
INFO	2019-03-27 17:56:47 +0000	master-replica-0		 54/254 [=====>........................] - ETA: 0s - loss: 0.3441 - acc: 0.845
INFO	2019-03-27 17:56:47 +0000	master-replica-0		 71/254 [=======>......................] - ETA: 0s - loss: 0.3448 - acc: 0.844
INFO	2019-03-27 17:56:47 +0000	master-replica-0		 91/254 [=========>....................] - ETA: 0s - loss: 0.3435 - acc: 0.844
INFO	2019-03-27 17:56:47 +0000	master-replica-0		111/254 [============>.................] - ETA: 0s - loss: 0.3417 - acc: 0.844
INFO	2019-03-27 17:56:47 +0000	master-replica-0		130/254 [==============>...............] - ETA: 0s - loss: 0.3392 - acc: 0.846
INFO	2019-03-27 17:56:47 +0000	master-replica-0		149/254 [================>.............] - ETA: 0s - loss: 0.3386 - acc: 0.846
INFO	2019-03-27 17:56:47 +0000	master-replica-0		167/254 [==================>...........] - ETA: 0s - loss: 0.3370 - acc: 0.845
INFO	2019-03-27 17:56:47 +0000	master-replica-0		188/254 [=====================>........] - ETA: 0s - loss: 0.3361 - acc: 0.845
INFO	2019-03-27 17:56:48 +0000	master-replica-0		208/254 [=======================>......] - ETA: 0s - loss: 0.3349 - acc: 0.845
INFO	2019-03-27 17:56:48 +0000	master-replica-0		228/254 [=========================>....] - ETA: 0s - loss: 0.3364 - acc: 0.845
INFO	2019-03-27 17:56:48 +0000	master-replica-0		245/254 [===========================>..] - ETA: 0s - loss: 0.3367 - acc: 0.845
INFO	2019-03-27 17:56:48 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3366 - acc: 0.8451 - val_loss: 0.3431 - val_acc: 0.8319
INFO	2019-03-27 17:56:48 +0000	master-replica-0		Epoch 00005: LearningRateScheduler reducing learning rate to 0.010625.
INFO	2019-03-27 17:56:48 +0000	master-replica-0		Epoch 5/20
INFO	2019-03-27 17:56:48 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3805 - acc: 0.828
INFO	2019-03-27 17:56:48 +0000	master-replica-0		 22/254 [=>............................] - ETA: 0s - loss: 0.3307 - acc: 0.854
INFO	2019-03-27 17:56:48 +0000	master-replica-0		 41/254 [===>..........................] - ETA: 0s - loss: 0.3297 - acc: 0.847
INFO	2019-03-27 17:56:48 +0000	master-replica-0		 59/254 [=====>........................] - ETA: 0s - loss: 0.3334 - acc: 0.845
INFO	2019-03-27 17:56:48 +0000	master-replica-0		 74/254 [=======>......................] - ETA: 0s - loss: 0.3299 - acc: 0.850
INFO	2019-03-27 17:56:48 +0000	master-replica-0		 87/254 [=========>....................] - ETA: 0s - loss: 0.3323 - acc: 0.849
INFO	2019-03-27 17:56:48 +0000	master-replica-0		 97/254 [==========>...................] - ETA: 0s - loss: 0.3336 - acc: 0.849
INFO	2019-03-27 17:56:48 +0000	master-replica-0		107/254 [===========>..................] - ETA: 0s - loss: 0.3344 - acc: 0.849
INFO	2019-03-27 17:56:48 +0000	master-replica-0		120/254 [=============>................] - ETA: 0s - loss: 0.3367 - acc: 0.847
INFO	2019-03-27 17:56:48 +0000	master-replica-0		136/254 [===============>..............] - ETA: 0s - loss: 0.3368 - acc: 0.846
INFO	2019-03-27 17:56:48 +0000	master-replica-0		151/254 [================>.............] - ETA: 0s - loss: 0.3359 - acc: 0.846
INFO	2019-03-27 17:56:48 +0000	master-replica-0		169/254 [==================>...........] - ETA: 0s - loss: 0.3373 - acc: 0.844
INFO	2019-03-27 17:56:48 +0000	master-replica-0		187/254 [=====================>........] - ETA: 0s - loss: 0.3364 - acc: 0.845
INFO	2019-03-27 17:56:48 +0000	master-replica-0		205/254 [=======================>......] - ETA: 0s - loss: 0.3348 - acc: 0.844
INFO	2019-03-27 17:56:48 +0000	master-replica-0		223/254 [=========================>....] - ETA: 0s - loss: 0.3345 - acc: 0.845
INFO	2019-03-27 17:56:49 +0000	master-replica-0		242/254 [===========================>..] - ETA: 0s - loss: 0.3353 - acc: 0.844
INFO	2019-03-27 17:56:49 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3339 - acc: 0.8453 - val_loss: 0.3486 - val_acc: 0.8504
INFO	2019-03-27 17:56:49 +0000	master-replica-0		Epoch 00006: LearningRateScheduler reducing learning rate to 0.0103125.
INFO	2019-03-27 17:56:49 +0000	master-replica-0		Epoch 6/20
INFO	2019-03-27 17:56:49 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.5068 - acc: 0.789
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 18/254 [=>............................] - ETA: 0s - loss: 0.3490 - acc: 0.845
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 38/254 [===>..........................] - ETA: 0s - loss: 0.3355 - acc: 0.851
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 58/254 [=====>........................] - ETA: 0s - loss: 0.3312 - acc: 0.852
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 78/254 [========>.....................] - ETA: 0s - loss: 0.3311 - acc: 0.852
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 99/254 [==========>...................] - ETA: 0s - loss: 0.3317 - acc: 0.850
INFO	2019-03-27 17:56:49 +0000	master-replica-0		116/254 [============>.................] - ETA: 0s - loss: 0.3312 - acc: 0.849
INFO	2019-03-27 17:56:49 +0000	master-replica-0		134/254 [==============>...............] - ETA: 0s - loss: 0.3326 - acc: 0.850
INFO	2019-03-27 17:56:49 +0000	master-replica-0		153/254 [=================>............] - ETA: 0s - loss: 0.3326 - acc: 0.849
INFO	2019-03-27 17:56:49 +0000	master-replica-0		169/254 [==================>...........] - ETA: 0s - loss: 0.3351 - acc: 0.848
INFO	2019-03-27 17:56:49 +0000	master-replica-0		187/254 [=====================>........] - ETA: 0s - loss: 0.3351 - acc: 0.848
INFO	2019-03-27 17:56:49 +0000	master-replica-0		204/254 [=======================>......] - ETA: 0s - loss: 0.3346 - acc: 0.847
INFO	2019-03-27 17:56:49 +0000	master-replica-0		220/254 [========================>.....] - ETA: 0s - loss: 0.3345 - acc: 0.848
INFO	2019-03-27 17:56:49 +0000	master-replica-0		237/254 [==========================>...] - ETA: 0s - loss: 0.3330 - acc: 0.848
INFO	2019-03-27 17:56:49 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3317 - acc: 0.8483 - val_loss: 0.3297 - val_acc: 0.8402
INFO	2019-03-27 17:56:49 +0000	master-replica-0		Epoch 00007: LearningRateScheduler reducing learning rate to 0.01015625.
INFO	2019-03-27 17:56:49 +0000	master-replica-0		Epoch 7/20
INFO	2019-03-27 17:56:49 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.2934 - acc: 0.890
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 19/254 [=>............................] - ETA: 0s - loss: 0.3127 - acc: 0.858
INFO	2019-03-27 17:56:49 +0000	master-replica-0		 37/254 [===>..........................] - ETA: 0s - loss: 0.3242 - acc: 0.855
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 55/254 [=====>........................] - ETA: 0s - loss: 0.3300 - acc: 0.848
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 74/254 [=======>......................] - ETA: 0s - loss: 0.3345 - acc: 0.848
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 93/254 [=========>....................] - ETA: 0s - loss: 0.3329 - acc: 0.846
INFO	2019-03-27 17:56:50 +0000	master-replica-0		111/254 [============>.................] - ETA: 0s - loss: 0.3327 - acc: 0.846
INFO	2019-03-27 17:56:50 +0000	master-replica-0		128/254 [==============>...............] - ETA: 0s - loss: 0.3327 - acc: 0.846
INFO	2019-03-27 17:56:50 +0000	master-replica-0		147/254 [================>.............] - ETA: 0s - loss: 0.3307 - acc: 0.847
INFO	2019-03-27 17:56:50 +0000	master-replica-0		163/254 [==================>...........] - ETA: 0s - loss: 0.3310 - acc: 0.848
INFO	2019-03-27 17:56:50 +0000	master-replica-0		183/254 [====================>.........] - ETA: 0s - loss: 0.3316 - acc: 0.848
INFO	2019-03-27 17:56:50 +0000	master-replica-0		202/254 [======================>.......] - ETA: 0s - loss: 0.3335 - acc: 0.847
INFO	2019-03-27 17:56:50 +0000	master-replica-0		219/254 [========================>.....] - ETA: 0s - loss: 0.3312 - acc: 0.849
INFO	2019-03-27 17:56:50 +0000	master-replica-0		238/254 [===========================>..] - ETA: 0s - loss: 0.3312 - acc: 0.848
INFO	2019-03-27 17:56:50 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3316 - acc: 0.8480 - val_loss: 0.3250 - val_acc: 0.8461
INFO	2019-03-27 17:56:50 +0000	master-replica-0		Epoch 00008: LearningRateScheduler reducing learning rate to 0.010078125.
INFO	2019-03-27 17:56:50 +0000	master-replica-0		Epoch 8/20
INFO	2019-03-27 17:56:50 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3214 - acc: 0.859
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 19/254 [=>............................] - ETA: 0s - loss: 0.3520 - acc: 0.842
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 34/254 [===>..........................] - ETA: 0s - loss: 0.3456 - acc: 0.844
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 49/254 [====>.........................] - ETA: 0s - loss: 0.3347 - acc: 0.849
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 64/254 [======>.......................] - ETA: 0s - loss: 0.3296 - acc: 0.850
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 79/254 [========>.....................] - ETA: 0s - loss: 0.3293 - acc: 0.850
INFO	2019-03-27 17:56:50 +0000	master-replica-0		 95/254 [==========>...................] - ETA: 0s - loss: 0.3288 - acc: 0.850
INFO	2019-03-27 17:56:50 +0000	master-replica-0		112/254 [============>.................] - ETA: 0s - loss: 0.3285 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		128/254 [==============>...............] - ETA: 0s - loss: 0.3307 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		144/254 [================>.............] - ETA: 0s - loss: 0.3294 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		161/254 [==================>...........] - ETA: 0s - loss: 0.3294 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		175/254 [===================>..........] - ETA: 0s - loss: 0.3289 - acc: 0.849
INFO	2019-03-27 17:56:51 +0000	master-replica-0		190/254 [=====================>........] - ETA: 0s - loss: 0.3308 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		206/254 [=======================>......] - ETA: 0s - loss: 0.3302 - acc: 0.849
INFO	2019-03-27 17:56:51 +0000	master-replica-0		221/254 [=========================>....] - ETA: 0s - loss: 0.3310 - acc: 0.849
INFO	2019-03-27 17:56:51 +0000	master-replica-0		237/254 [==========================>...] - ETA: 0s - loss: 0.3327 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3313 - acc: 0.8483 - val_loss: 0.3255 - val_acc: 0.8500
INFO	2019-03-27 17:56:51 +0000	master-replica-0		Epoch 00009: LearningRateScheduler reducing learning rate to 0.0100390625.
INFO	2019-03-27 17:56:51 +0000	master-replica-0		Epoch 9/20
INFO	2019-03-27 17:56:51 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.4250 - acc: 0.796
INFO	2019-03-27 17:56:51 +0000	master-replica-0		 20/254 [=>............................] - ETA: 0s - loss: 0.3384 - acc: 0.850
INFO	2019-03-27 17:56:51 +0000	master-replica-0		 35/254 [===>..........................] - ETA: 0s - loss: 0.3326 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		 51/254 [=====>........................] - ETA: 0s - loss: 0.3307 - acc: 0.850
INFO	2019-03-27 17:56:51 +0000	master-replica-0		 68/254 [=======>......................] - ETA: 0s - loss: 0.3312 - acc: 0.851
INFO	2019-03-27 17:56:51 +0000	master-replica-0		 85/254 [=========>....................] - ETA: 0s - loss: 0.3337 - acc: 0.850
INFO	2019-03-27 17:56:51 +0000	master-replica-0		102/254 [===========>..................] - ETA: 0s - loss: 0.3346 - acc: 0.849
INFO	2019-03-27 17:56:51 +0000	master-replica-0		118/254 [============>.................] - ETA: 0s - loss: 0.3358 - acc: 0.848
INFO	2019-03-27 17:56:51 +0000	master-replica-0		134/254 [==============>...............] - ETA: 0s - loss: 0.3320 - acc: 0.849
INFO	2019-03-27 17:56:51 +0000	master-replica-0		150/254 [================>.............] - ETA: 0s - loss: 0.3319 - acc: 0.848
INFO	2019-03-27 17:56:52 +0000	master-replica-0		168/254 [==================>...........] - ETA: 0s - loss: 0.3337 - acc: 0.849
INFO	2019-03-27 17:56:52 +0000	master-replica-0		187/254 [=====================>........] - ETA: 0s - loss: 0.3332 - acc: 0.848
INFO	2019-03-27 17:56:52 +0000	master-replica-0		206/254 [=======================>......] - ETA: 0s - loss: 0.3315 - acc: 0.848
INFO	2019-03-27 17:56:52 +0000	master-replica-0		224/254 [=========================>....] - ETA: 0s - loss: 0.3311 - acc: 0.849
INFO	2019-03-27 17:56:52 +0000	master-replica-0		242/254 [===========================>..] - ETA: 0s - loss: 0.3304 - acc: 0.848
INFO	2019-03-27 17:56:52 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3314 - acc: 0.8485 - val_loss: 0.3236 - val_acc: 0.8520
INFO	2019-03-27 17:56:52 +0000	master-replica-0		Epoch 00010: LearningRateScheduler reducing learning rate to 0.01001953125.
INFO	2019-03-27 17:56:52 +0000	master-replica-0		Epoch 10/20
INFO	2019-03-27 17:56:52 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.2766 - acc: 0.890
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 18/254 [=>............................] - ETA: 0s - loss: 0.3305 - acc: 0.850
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 33/254 [==>...........................] - ETA: 0s - loss: 0.3337 - acc: 0.848
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 48/254 [====>.........................] - ETA: 0s - loss: 0.3414 - acc: 0.843
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 62/254 [======>.......................] - ETA: 0s - loss: 0.3370 - acc: 0.844
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 73/254 [=======>......................] - ETA: 0s - loss: 0.3332 - acc: 0.847
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 84/254 [========>.....................] - ETA: 0s - loss: 0.3283 - acc: 0.850
INFO	2019-03-27 17:56:52 +0000	master-replica-0		 94/254 [==========>...................] - ETA: 0s - loss: 0.3290 - acc: 0.850
INFO	2019-03-27 17:56:52 +0000	master-replica-0		105/254 [===========>..................] - ETA: 0s - loss: 0.3292 - acc: 0.851
INFO	2019-03-27 17:56:52 +0000	master-replica-0		119/254 [=============>................] - ETA: 0s - loss: 0.3262 - acc: 0.852
INFO	2019-03-27 17:56:52 +0000	master-replica-0		133/254 [==============>...............] - ETA: 0s - loss: 0.3295 - acc: 0.850
INFO	2019-03-27 17:56:52 +0000	master-replica-0		148/254 [================>.............] - ETA: 0s - loss: 0.3296 - acc: 0.849
INFO	2019-03-27 17:56:52 +0000	master-replica-0		164/254 [==================>...........] - ETA: 0s - loss: 0.3290 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		181/254 [====================>.........] - ETA: 0s - loss: 0.3303 - acc: 0.848
INFO	2019-03-27 17:56:53 +0000	master-replica-0		199/254 [======================>.......] - ETA: 0s - loss: 0.3291 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		217/254 [========================>.....] - ETA: 0s - loss: 0.3297 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		236/254 [==========================>...] - ETA: 0s - loss: 0.3295 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		254/254 [==============================] - 1s 4ms/step - loss: 0.3291 - acc: 0.8494 - val_loss: 0.3264 - val_acc: 0.8516
INFO	2019-03-27 17:56:53 +0000	master-replica-0		Epoch 00011: LearningRateScheduler reducing learning rate to 0.010009765625.
INFO	2019-03-27 17:56:53 +0000	master-replica-0		Epoch 11/20
INFO	2019-03-27 17:56:53 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3474 - acc: 0.859
INFO	2019-03-27 17:56:53 +0000	master-replica-0		 18/254 [=>............................] - ETA: 0s - loss: 0.3394 - acc: 0.857
INFO	2019-03-27 17:56:53 +0000	master-replica-0		 37/254 [===>..........................] - ETA: 0s - loss: 0.3287 - acc: 0.855
INFO	2019-03-27 17:56:53 +0000	master-replica-0		 56/254 [=====>........................] - ETA: 0s - loss: 0.3220 - acc: 0.855
INFO	2019-03-27 17:56:53 +0000	master-replica-0		 74/254 [=======>......................] - ETA: 0s - loss: 0.3257 - acc: 0.854
INFO	2019-03-27 17:56:53 +0000	master-replica-0		 92/254 [=========>....................] - ETA: 0s - loss: 0.3259 - acc: 0.852
INFO	2019-03-27 17:56:53 +0000	master-replica-0		111/254 [============>.................] - ETA: 0s - loss: 0.3267 - acc: 0.850
INFO	2019-03-27 17:56:53 +0000	master-replica-0		130/254 [==============>...............] - ETA: 0s - loss: 0.3266 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		147/254 [================>.............] - ETA: 0s - loss: 0.3260 - acc: 0.851
INFO	2019-03-27 17:56:53 +0000	master-replica-0		166/254 [==================>...........] - ETA: 0s - loss: 0.3253 - acc: 0.850
INFO	2019-03-27 17:56:53 +0000	master-replica-0		185/254 [====================>.........] - ETA: 0s - loss: 0.3249 - acc: 0.850
INFO	2019-03-27 17:56:53 +0000	master-replica-0		204/254 [=======================>......] - ETA: 0s - loss: 0.3257 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		221/254 [=========================>....] - ETA: 0s - loss: 0.3275 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		241/254 [===========================>..] - ETA: 0s - loss: 0.3264 - acc: 0.849
INFO	2019-03-27 17:56:53 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3270 - acc: 0.8494 - val_loss: 0.3246 - val_acc: 0.8499
INFO	2019-03-27 17:56:53 +0000	master-replica-0		Epoch 00012: LearningRateScheduler reducing learning rate to 0.010004882812500001.
INFO	2019-03-27 17:56:53 +0000	master-replica-0		Epoch 12/20
INFO	2019-03-27 17:56:54 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3179 - acc: 0.851
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 19/254 [=>............................] - ETA: 0s - loss: 0.3353 - acc: 0.843
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 39/254 [===>..........................] - ETA: 0s - loss: 0.3319 - acc: 0.848
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 59/254 [=====>........................] - ETA: 0s - loss: 0.3268 - acc: 0.854
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 79/254 [========>.....................] - ETA: 0s - loss: 0.3309 - acc: 0.849
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 99/254 [==========>...................] - ETA: 0s - loss: 0.3315 - acc: 0.850
INFO	2019-03-27 17:56:54 +0000	master-replica-0		119/254 [=============>................] - ETA: 0s - loss: 0.3311 - acc: 0.851
INFO	2019-03-27 17:56:54 +0000	master-replica-0		135/254 [==============>...............] - ETA: 0s - loss: 0.3284 - acc: 0.852
INFO	2019-03-27 17:56:54 +0000	master-replica-0		153/254 [=================>............] - ETA: 0s - loss: 0.3270 - acc: 0.851
INFO	2019-03-27 17:56:54 +0000	master-replica-0		172/254 [===================>..........] - ETA: 0s - loss: 0.3263 - acc: 0.851
INFO	2019-03-27 17:56:54 +0000	master-replica-0		190/254 [=====================>........] - ETA: 0s - loss: 0.3263 - acc: 0.851
INFO	2019-03-27 17:56:54 +0000	master-replica-0		206/254 [=======================>......] - ETA: 0s - loss: 0.3271 - acc: 0.850
INFO	2019-03-27 17:56:54 +0000	master-replica-0		225/254 [=========================>....] - ETA: 0s - loss: 0.3274 - acc: 0.850
INFO	2019-03-27 17:56:54 +0000	master-replica-0		242/254 [===========================>..] - ETA: 0s - loss: 0.3271 - acc: 0.850
INFO	2019-03-27 17:56:54 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3276 - acc: 0.8500 - val_loss: 0.3452 - val_acc: 0.8444
INFO	2019-03-27 17:56:54 +0000	master-replica-0		Epoch 00013: LearningRateScheduler reducing learning rate to 0.01000244140625.
INFO	2019-03-27 17:56:54 +0000	master-replica-0		Epoch 13/20
INFO	2019-03-27 17:56:54 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3248 - acc: 0.851
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 19/254 [=>............................] - ETA: 0s - loss: 0.3278 - acc: 0.837
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 38/254 [===>..........................] - ETA: 0s - loss: 0.3317 - acc: 0.842
INFO	2019-03-27 17:56:54 +0000	master-replica-0		 53/254 [=====>........................] - ETA: 0s - loss: 0.3286 - acc: 0.846
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 71/254 [=======>......................] - ETA: 0s - loss: 0.3204 - acc: 0.852
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 88/254 [=========>....................] - ETA: 0s - loss: 0.3232 - acc: 0.851
INFO	2019-03-27 17:56:55 +0000	master-replica-0		105/254 [===========>..................] - ETA: 0s - loss: 0.3271 - acc: 0.851
INFO	2019-03-27 17:56:55 +0000	master-replica-0		124/254 [=============>................] - ETA: 0s - loss: 0.3308 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		141/254 [===============>..............] - ETA: 0s - loss: 0.3316 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		157/254 [=================>............] - ETA: 0s - loss: 0.3294 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		171/254 [===================>..........] - ETA: 0s - loss: 0.3298 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		189/254 [=====================>........] - ETA: 0s - loss: 0.3304 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		206/254 [=======================>......] - ETA: 0s - loss: 0.3295 - acc: 0.849
INFO	2019-03-27 17:56:55 +0000	master-replica-0		225/254 [=========================>....] - ETA: 0s - loss: 0.3315 - acc: 0.847
INFO	2019-03-27 17:56:55 +0000	master-replica-0		242/254 [===========================>..] - ETA: 0s - loss: 0.3305 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3301 - acc: 0.8485 - val_loss: 0.3439 - val_acc: 0.8439
INFO	2019-03-27 17:56:55 +0000	master-replica-0		Epoch 00014: LearningRateScheduler reducing learning rate to 0.010001220703125.
INFO	2019-03-27 17:56:55 +0000	master-replica-0		Epoch 14/20
INFO	2019-03-27 17:56:55 +0000	master-replica-0		  1/254 [..............................] - ETA: 1s - loss: 0.3098 - acc: 0.890
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 16/254 [>.............................] - ETA: 0s - loss: 0.3104 - acc: 0.855
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 33/254 [==>...........................] - ETA: 0s - loss: 0.3184 - acc: 0.852
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 49/254 [====>.........................] - ETA: 0s - loss: 0.3204 - acc: 0.852
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 64/254 [======>.......................] - ETA: 0s - loss: 0.3218 - acc: 0.851
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 80/254 [========>.....................] - ETA: 0s - loss: 0.3252 - acc: 0.848
INFO	2019-03-27 17:56:55 +0000	master-replica-0		 96/254 [==========>...................] - ETA: 0s - loss: 0.3242 - acc: 0.847
INFO	2019-03-27 17:56:56 +0000	master-replica-0		112/254 [============>.................] - ETA: 0s - loss: 0.3230 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		129/254 [==============>...............] - ETA: 0s - loss: 0.3256 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		144/254 [================>.............] - ETA: 0s - loss: 0.3255 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		162/254 [==================>...........] - ETA: 0s - loss: 0.3251 - acc: 0.847
INFO	2019-03-27 17:56:56 +0000	master-replica-0		177/254 [===================>..........] - ETA: 0s - loss: 0.3249 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		195/254 [======================>.......] - ETA: 0s - loss: 0.3263 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		211/254 [=======================>......] - ETA: 0s - loss: 0.3261 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		227/254 [=========================>....] - ETA: 0s - loss: 0.3257 - acc: 0.849
INFO	2019-03-27 17:56:56 +0000	master-replica-0		242/254 [===========================>..] - ETA: 0s - loss: 0.3262 - acc: 0.850
INFO	2019-03-27 17:56:56 +0000	master-replica-0		254/254 [==============================] - 1s 4ms/step - loss: 0.3265 - acc: 0.8503 - val_loss: 0.3399 - val_acc: 0.8413
INFO	2019-03-27 17:56:56 +0000	master-replica-0		Epoch 00015: LearningRateScheduler reducing learning rate to 0.0100006103515625.
INFO	2019-03-27 17:56:56 +0000	master-replica-0		Epoch 15/20
INFO	2019-03-27 17:56:56 +0000	master-replica-0		  1/254 [..............................] - ETA: 1s - loss: 0.2755 - acc: 0.859
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 14/254 [>.............................] - ETA: 0s - loss: 0.3182 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 27/254 [==>...........................] - ETA: 0s - loss: 0.3285 - acc: 0.846
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 39/254 [===>..........................] - ETA: 0s - loss: 0.3268 - acc: 0.849
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 47/254 [====>.........................] - ETA: 0s - loss: 0.3321 - acc: 0.849
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 59/254 [=====>........................] - ETA: 0s - loss: 0.3231 - acc: 0.853
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 72/254 [=======>......................] - ETA: 0s - loss: 0.3262 - acc: 0.851
INFO	2019-03-27 17:56:56 +0000	master-replica-0		 86/254 [=========>....................] - ETA: 0s - loss: 0.3273 - acc: 0.848
INFO	2019-03-27 17:56:56 +0000	master-replica-0		101/254 [==========>...................] - ETA: 0s - loss: 0.3294 - acc: 0.847
INFO	2019-03-27 17:56:57 +0000	master-replica-0		117/254 [============>.................] - ETA: 0s - loss: 0.3276 - acc: 0.848
INFO	2019-03-27 17:56:57 +0000	master-replica-0		131/254 [==============>...............] - ETA: 0s - loss: 0.3263 - acc: 0.850
INFO	2019-03-27 17:56:57 +0000	master-replica-0		148/254 [================>.............] - ETA: 0s - loss: 0.3250 - acc: 0.851
INFO	2019-03-27 17:56:57 +0000	master-replica-0		165/254 [==================>...........] - ETA: 0s - loss: 0.3260 - acc: 0.851
INFO	2019-03-27 17:56:57 +0000	master-replica-0		182/254 [====================>.........] - ETA: 0s - loss: 0.3261 - acc: 0.851
INFO	2019-03-27 17:56:57 +0000	master-replica-0		201/254 [======================>.......] - ETA: 0s - loss: 0.3259 - acc: 0.850
INFO	2019-03-27 17:56:57 +0000	master-replica-0		218/254 [========================>.....] - ETA: 0s - loss: 0.3261 - acc: 0.850
INFO	2019-03-27 17:56:57 +0000	master-replica-0		237/254 [==========================>...] - ETA: 0s - loss: 0.3266 - acc: 0.850
INFO	2019-03-27 17:56:57 +0000	master-replica-0		254/254 [==============================] - 1s 4ms/step - loss: 0.3285 - acc: 0.8496 - val_loss: 0.3191 - val_acc: 0.8499
INFO	2019-03-27 17:56:57 +0000	master-replica-0		Epoch 00016: LearningRateScheduler reducing learning rate to 0.01000030517578125.
INFO	2019-03-27 17:56:57 +0000	master-replica-0		Epoch 16/20
INFO	2019-03-27 17:56:57 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.3200 - acc: 0.828
INFO	2019-03-27 17:56:57 +0000	master-replica-0		 20/254 [=>............................] - ETA: 0s - loss: 0.3262 - acc: 0.844
INFO	2019-03-27 17:56:57 +0000	master-replica-0		 40/254 [===>..........................] - ETA: 0s - loss: 0.3369 - acc: 0.843
INFO	2019-03-27 17:56:57 +0000	master-replica-0		 59/254 [=====>........................] - ETA: 0s - loss: 0.3348 - acc: 0.845
INFO	2019-03-27 17:56:57 +0000	master-replica-0		 79/254 [========>.....................] - ETA: 0s - loss: 0.3302 - acc: 0.846
INFO	2019-03-27 17:56:57 +0000	master-replica-0		 98/254 [==========>...................] - ETA: 0s - loss: 0.3292 - acc: 0.847
INFO	2019-03-27 17:56:57 +0000	master-replica-0		117/254 [============>.................] - ETA: 0s - loss: 0.3301 - acc: 0.846
INFO	2019-03-27 17:56:57 +0000	master-replica-0		134/254 [==============>...............] - ETA: 0s - loss: 0.3275 - acc: 0.848
INFO	2019-03-27 17:56:57 +0000	master-replica-0		151/254 [================>.............] - ETA: 0s - loss: 0.3253 - acc: 0.849
INFO	2019-03-27 17:56:57 +0000	master-replica-0		167/254 [==================>...........] - ETA: 0s - loss: 0.3259 - acc: 0.849
INFO	2019-03-27 17:56:57 +0000	master-replica-0		185/254 [====================>.........] - ETA: 0s - loss: 0.3271 - acc: 0.848
INFO	2019-03-27 17:56:58 +0000	master-replica-0		205/254 [=======================>......] - ETA: 0s - loss: 0.3292 - acc: 0.847
INFO	2019-03-27 17:56:58 +0000	master-replica-0		224/254 [=========================>....] - ETA: 0s - loss: 0.3284 - acc: 0.848
INFO	2019-03-27 17:56:58 +0000	master-replica-0		244/254 [===========================>..] - ETA: 0s - loss: 0.3271 - acc: 0.849
INFO	2019-03-27 17:56:58 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3278 - acc: 0.8495 - val_loss: 0.3245 - val_acc: 0.8537
INFO	2019-03-27 17:56:58 +0000	master-replica-0		Epoch 00017: LearningRateScheduler reducing learning rate to 0.010000152587890625.
INFO	2019-03-27 17:56:58 +0000	master-replica-0		Epoch 17/20
INFO	2019-03-27 17:56:58 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.2615 - acc: 0.882
INFO	2019-03-27 17:56:58 +0000	master-replica-0		 19/254 [=>............................] - ETA: 0s - loss: 0.3367 - acc: 0.844
INFO	2019-03-27 17:56:58 +0000	master-replica-0		 37/254 [===>..........................] - ETA: 0s - loss: 0.3328 - acc: 0.846
INFO	2019-03-27 17:56:58 +0000	master-replica-0		 55/254 [=====>........................] - ETA: 0s - loss: 0.3300 - acc: 0.850
INFO	2019-03-27 17:56:58 +0000	master-replica-0		 71/254 [=======>......................] - ETA: 0s - loss: 0.3255 - acc: 0.853
INFO	2019-03-27 17:56:58 +0000	master-replica-0		 88/254 [=========>....................] - ETA: 0s - loss: 0.3239 - acc: 0.853
INFO	2019-03-27 17:56:58 +0000	master-replica-0		105/254 [===========>..................] - ETA: 0s - loss: 0.3245 - acc: 0.851
INFO	2019-03-27 17:56:58 +0000	master-replica-0		122/254 [=============>................] - ETA: 0s - loss: 0.3234 - acc: 0.851
INFO	2019-03-27 17:56:58 +0000	master-replica-0		139/254 [===============>..............] - ETA: 0s - loss: 0.3218 - acc: 0.852
INFO	2019-03-27 17:56:58 +0000	master-replica-0		156/254 [=================>............] - ETA: 0s - loss: 0.3244 - acc: 0.851
INFO	2019-03-27 17:56:58 +0000	master-replica-0		172/254 [===================>..........] - ETA: 0s - loss: 0.3267 - acc: 0.850
INFO	2019-03-27 17:56:58 +0000	master-replica-0		188/254 [=====================>........] - ETA: 0s - loss: 0.3286 - acc: 0.849
INFO	2019-03-27 17:56:58 +0000	master-replica-0		203/254 [======================>.......] - ETA: 0s - loss: 0.3278 - acc: 0.850
INFO	2019-03-27 17:56:58 +0000	master-replica-0		219/254 [========================>.....] - ETA: 0s - loss: 0.3263 - acc: 0.850
INFO	2019-03-27 17:56:59 +0000	master-replica-0		236/254 [==========================>...] - ETA: 0s - loss: 0.3270 - acc: 0.850
INFO	2019-03-27 17:56:59 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3270 - acc: 0.8500 - val_loss: 0.3272 - val_acc: 0.8521
INFO	2019-03-27 17:56:59 +0000	master-replica-0		Epoch 00018: LearningRateScheduler reducing learning rate to 0.010000076293945313.
INFO	2019-03-27 17:56:59 +0000	master-replica-0		Epoch 18/20
INFO	2019-03-27 17:56:59 +0000	master-replica-0		  1/254 [..............................] - ETA: 1s - loss: 0.3422 - acc: 0.828
INFO	2019-03-27 17:56:59 +0000	master-replica-0		 16/254 [>.............................] - ETA: 0s - loss: 0.3106 - acc: 0.850
INFO	2019-03-27 17:56:59 +0000	master-replica-0		 32/254 [==>...........................] - ETA: 0s - loss: 0.3128 - acc: 0.850
INFO	2019-03-27 17:56:59 +0000	master-replica-0		 47/254 [====>.........................] - ETA: 0s - loss: 0.3246 - acc: 0.847
INFO	2019-03-27 17:56:59 +0000	master-replica-0		 63/254 [======>.......................] - ETA: 0s - loss: 0.3241 - acc: 0.850
INFO	2019-03-27 17:56:59 +0000	master-replica-0		 79/254 [========>.....................] - ETA: 0s - loss: 0.3233 - acc: 0.849
INFO	2019-03-27 17:56:59 +0000	master-replica-0		 95/254 [==========>...................] - ETA: 0s - loss: 0.3282 - acc: 0.847
INFO	2019-03-27 17:56:59 +0000	master-replica-0		111/254 [============>.................] - ETA: 0s - loss: 0.3282 - acc: 0.847
INFO	2019-03-27 17:56:59 +0000	master-replica-0		126/254 [=============>................] - ETA: 0s - loss: 0.3266 - acc: 0.848
INFO	2019-03-27 17:56:59 +0000	master-replica-0		145/254 [================>.............] - ETA: 0s - loss: 0.3273 - acc: 0.848
INFO	2019-03-27 17:56:59 +0000	master-replica-0		163/254 [==================>...........] - ETA: 0s - loss: 0.3287 - acc: 0.847
INFO	2019-03-27 17:56:59 +0000	master-replica-0		178/254 [====================>.........] - ETA: 0s - loss: 0.3267 - acc: 0.848
INFO	2019-03-27 17:56:59 +0000	master-replica-0		194/254 [=====================>........] - ETA: 0s - loss: 0.3253 - acc: 0.849
INFO	2019-03-27 17:56:59 +0000	master-replica-0		209/254 [=======================>......] - ETA: 0s - loss: 0.3248 - acc: 0.849
INFO	2019-03-27 17:56:59 +0000	master-replica-0		224/254 [=========================>....] - ETA: 0s - loss: 0.3249 - acc: 0.849
INFO	2019-03-27 17:56:59 +0000	master-replica-0		240/254 [===========================>..] - ETA: 0s - loss: 0.3254 - acc: 0.849
INFO	2019-03-27 17:56:59 +0000	master-replica-0		254/254 [==============================] - 1s 4ms/step - loss: 0.3244 - acc: 0.8493 - val_loss: 0.3271 - val_acc: 0.8508
INFO	2019-03-27 17:56:59 +0000	master-replica-0		Epoch 00019: LearningRateScheduler reducing learning rate to 0.010000038146972657.
INFO	2019-03-27 17:56:59 +0000	master-replica-0		Epoch 19/20
INFO	2019-03-27 17:57:00 +0000	master-replica-0		  1/254 [..............................] - ETA: 1s - loss: 0.4041 - acc: 0.851
INFO	2019-03-27 17:57:00 +0000	master-replica-0		 16/254 [>.............................] - ETA: 0s - loss: 0.3522 - acc: 0.846
INFO	2019-03-27 17:57:00 +0000	master-replica-0		 33/254 [==>...........................] - ETA: 0s - loss: 0.3445 - acc: 0.849
INFO	2019-03-27 17:57:00 +0000	master-replica-0		 50/254 [====>.........................] - ETA: 0s - loss: 0.3336 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		 65/254 [======>.......................] - ETA: 0s - loss: 0.3331 - acc: 0.852
INFO	2019-03-27 17:57:00 +0000	master-replica-0		 84/254 [========>.....................] - ETA: 0s - loss: 0.3301 - acc: 0.852
INFO	2019-03-27 17:57:00 +0000	master-replica-0		102/254 [===========>..................] - ETA: 0s - loss: 0.3289 - acc: 0.851
INFO	2019-03-27 17:57:00 +0000	master-replica-0		118/254 [============>.................] - ETA: 0s - loss: 0.3290 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		134/254 [==============>...............] - ETA: 0s - loss: 0.3310 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		154/254 [=================>............] - ETA: 0s - loss: 0.3283 - acc: 0.851
INFO	2019-03-27 17:57:00 +0000	master-replica-0		173/254 [===================>..........] - ETA: 0s - loss: 0.3270 - acc: 0.851
INFO	2019-03-27 17:57:00 +0000	master-replica-0		190/254 [=====================>........] - ETA: 0s - loss: 0.3266 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		201/254 [======================>.......] - ETA: 0s - loss: 0.3260 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		217/254 [========================>.....] - ETA: 0s - loss: 0.3281 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		231/254 [==========================>...] - ETA: 0s - loss: 0.3281 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		242/254 [===========================>..] - ETA: 0s - loss: 0.3270 - acc: 0.850
INFO	2019-03-27 17:57:00 +0000	master-replica-0		254/254 [==============================] - 1s 4ms/step - loss: 0.3281 - acc: 0.8504 - val_loss: 0.3239 - val_acc: 0.8521
INFO	2019-03-27 17:57:00 +0000	master-replica-0		Epoch 00020: LearningRateScheduler reducing learning rate to 0.010000019073486329.
INFO	2019-03-27 17:57:00 +0000	master-replica-0		Epoch 20/20
INFO	2019-03-27 17:57:00 +0000	master-replica-0		  1/254 [..............................] - ETA: 0s - loss: 0.2766 - acc: 0.867
INFO	2019-03-27 17:57:00 +0000	master-replica-0		 20/254 [=>............................] - ETA: 0s - loss: 0.3260 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		 37/254 [===>..........................] - ETA: 0s - loss: 0.3271 - acc: 0.847
INFO	2019-03-27 17:57:01 +0000	master-replica-0		 58/254 [=====>........................] - ETA: 0s - loss: 0.3302 - acc: 0.847
INFO	2019-03-27 17:57:01 +0000	master-replica-0		 79/254 [========>.....................] - ETA: 0s - loss: 0.3354 - acc: 0.847
INFO	2019-03-27 17:57:01 +0000	master-replica-0		 98/254 [==========>...................] - ETA: 0s - loss: 0.3344 - acc: 0.846
INFO	2019-03-27 17:57:01 +0000	master-replica-0		119/254 [=============>................] - ETA: 0s - loss: 0.3318 - acc: 0.849
INFO	2019-03-27 17:57:01 +0000	master-replica-0		139/254 [===============>..............] - ETA: 0s - loss: 0.3314 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		156/254 [=================>............] - ETA: 0s - loss: 0.3316 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		175/254 [===================>..........] - ETA: 0s - loss: 0.3317 - acc: 0.851
INFO	2019-03-27 17:57:01 +0000	master-replica-0		192/254 [=====================>........] - ETA: 0s - loss: 0.3309 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		212/254 [========================>.....] - ETA: 0s - loss: 0.3296 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		231/254 [==========================>...] - ETA: 0s - loss: 0.3282 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		250/254 [============================>.] - ETA: 0s - loss: 0.3294 - acc: 0.850
INFO	2019-03-27 17:57:01 +0000	master-replica-0		254/254 [==============================] - 1s 3ms/step - loss: 0.3294 - acc: 0.8508 - val_loss: 0.3282 - val_acc: 0.8519
WARNING	2019-03-27 17:57:03 +0000	master-replica-0		This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7f3904fa0518>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.
WARNING	2019-03-27 17:57:03 +0000	master-replica-0		
WARNING	2019-03-27 17:57:03 +0000	master-replica-0		Consider using a TensorFlow optimizer from `tf.train`.
WARNING	2019-03-27 17:57:05 +0000	master-replica-0		From /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/network.py:1436: update_checkpoint_state (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
WARNING	2019-03-27 17:57:05 +0000	master-replica-0		Instructions for updating:
WARNING	2019-03-27 17:57:05 +0000	master-replica-0		Use tf.train.CheckpointManager to manage checkpoints rather than manually editing the Checkpoint proto.
WARNING	2019-03-27 17:57:05 +0000	master-replica-0		Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
WARNING	2019-03-27 17:57:10 +0000	master-replica-0		From /usr/local/lib/python3.7/dist-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
WARNING	2019-03-27 17:57:10 +0000	master-replica-0		Instructions for updating:
WARNING	2019-03-27 17:57:10 +0000	master-replica-0		This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO	2019-03-27 17:57:10 +0000	master-replica-0		Signatures INCLUDED in export for Classify: None
INFO	2019-03-27 17:57:10 +0000	master-replica-0		Signatures INCLUDED in export for Eval: None
INFO	2019-03-27 17:57:10 +0000	master-replica-0		Signatures INCLUDED in export for Predict: ['serving_default']
INFO	2019-03-27 17:57:10 +0000	master-replica-0		Signatures INCLUDED in export for Regress: None
INFO	2019-03-27 17:57:10 +0000	master-replica-0		Signatures INCLUDED in export for Train: None
INFO	2019-03-27 17:57:10 +0000	master-replica-0		No assets to save.
INFO	2019-03-27 17:57:10 +0000	master-replica-0		No assets to write.
INFO	2019-03-27 17:57:10 +0000	master-replica-0		SavedModel written to: gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/saved_model.pb
INFO	2019-03-27 17:57:11 +0000	master-replica-0		Model exported to:  gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421
INFO	2019-03-27 17:57:11 +0000	master-replica-0		Module completed; cleaning up.
INFO	2019-03-27 17:57:11 +0000	master-replica-0		Clean up finished.
INFO	2019-03-27 17:57:11 +0000	master-replica-0		Task completed successfully.
endTime: '2019-03-27T18:01:46'
jobId: my_first_keras_job
startTime: '2019-03-27T17:55:34'
state: SUCCEEDED

Part 2. Quickstart for online predictions in AI Platform

This section shows how to use AI Platform and your trained model from Part 1 to predict a person's income bracket from other Census information about them.

Create model and version resources in AI Platform

To serve online predictions using the model you trained and exported in Part 1, create a model resource in AI Platform and a version resource within it. The version resource is what actually uses your trained model to serve predictions. This structure lets you adjust and retrain your model many times and organize all the versions together in AI Platform. Learn more about models and versions.

First, name and create the model resource:


In [0]:
MODEL_NAME = "my_first_keras_model"

! gcloud ai-platform models create $MODEL_NAME \
  --regions $REGION


Created AI Platform model [projects/<your-project-id>/models/my_first_keras_model].

Next, create the model version. The training job from Part 1 exported a timestamped TensorFlow SavedModel directory to your Cloud Storage bucket. AI Platform uses this directory to create a model version. Learn more about SavedModel and AI Platform.

You may be able to find the path to this directory in your training job's logs. Look for a line like:

Model exported to:  gs://<your-bucket-name>/keras-job-dir/keras_export/1545439782

Execute the following command to identify your SavedModel directory and use it to create a model version resource:


In [0]:
MODEL_VERSION = "v1"

# List the timestamped SavedModel directories inside `keras_export`
KERAS_EXPORT_DIRS = ! gsutil ls $JOB_DIR/keras_export/

# Pick the directory with the latest timestamp, in case you've trained
# multiple times
SAVED_MODEL_PATH = KERAS_EXPORT_DIRS[-1]

# Create model version based on that SavedModel directory
! gcloud ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --framework tensorflow \
  --origin $SAVED_MODEL_PATH
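
Optionally, confirm that the new version exists and is ready to serve before moving on (this check isn't required for the rest of the tutorial):

# Optional check: describe the version you just created.
! gcloud ai-platform versions describe $MODEL_VERSION --model $MODEL_NAME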

Prepare input for prediction

To receive valid and useful predictions, you must preprocess input for prediction in the same way that training data was preprocessed. In a production system, you may want to create a preprocessing pipeline that can be used identically at training time and prediction time.
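
In a pipeline like that, you might store the category lists and scaling statistics computed during training and reapply them to each incoming record. The following is only a minimal sketch of the idea; the CATEGORIES and NUMERIC_STATS values are made-up placeholders, not statistics computed by this tutorial:

# Illustrative sketch of a reusable preprocessing step (not part of the trainer
# package). The category lists and statistics below are placeholder assumptions;
# in a real pipeline they would be saved during training and loaded here.
CATEGORIES = {'occupation': ['Adm-clerical', 'Craft-repair', 'Prof-specialty']}
NUMERIC_STATS = {'age': {'mean': 38.6, 'std': 13.7}}

def preprocess_record(record):
    """Applies the same transformations that were used on the training data."""
    features = dict(record)
    for column, categories in CATEGORIES.items():
        features[column] = categories.index(features[column])
    for column, stats in NUMERIC_STATS.items():
        features[column] = (features[column] - stats['mean']) / stats['std']
    return features

preprocess_record({'age': 51.0, 'occupation': 'Prof-specialty'})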

For this exercise, use the training package's data-loading code to select a random sample from the evaluation data. This data is in the form that was used to evaluate accuracy after each epoch of training, so it can be used to send test predictions without further preprocessing:


In [0]:
from trainer import util

_, _, eval_x, eval_y = util.load_data()

prediction_input = eval_x.sample(20)
prediction_targets = eval_y[prediction_input.index]

prediction_input


Out[0]:
age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country
1979 0.901213 1 1.525542 2 9 0 4 -0.144792 -0.217132 -0.437544 38
2430 -0.922154 3 -0.419265 4 2 3 4 -0.144792 -0.217132 -0.034039 38
4214 -1.213893 3 -0.030304 4 10 1 4 -0.144792 -0.217132 1.579979 38
10389 -0.630415 3 0.358658 4 0 3 4 -0.144792 -0.217132 -0.679647 38
14525 -1.505632 3 -1.586149 4 7 3 0 -0.144792 -0.217132 -0.034039 38
15040 -0.119873 5 0.358658 2 2 0 4 -0.144792 -0.217132 -0.841048 38
8409 0.244801 3 1.525542 2 9 0 4 -0.144792 -0.217132 1.176475 6
10628 0.098931 1 1.525542 2 9 0 4 0.886847 -0.217132 -0.034039 38
10942 0.390670 5 -0.030304 2 4 0 4 -0.144792 -0.217132 4.727315 38
5129 1.120017 3 1.136580 2 12 0 4 -0.144792 -0.217132 -0.034039 38
2096 -1.286827 3 -0.030304 4 11 3 4 -0.144792 -0.217132 -1.648058 38
12463 -0.703350 3 -0.419265 2 7 5 4 -0.144792 4.502280 -0.437544 38
8528 0.536539 3 1.525542 4 3 4 4 -0.144792 -0.217132 -0.034039 38
7093 -1.359762 3 -0.419265 4 6 3 2 -0.144792 -0.217132 -0.034039 38
12565 0.536539 3 1.136580 0 11 2 2 -0.144792 -0.217132 -0.034039 38
5655 1.338821 3 -0.419265 2 2 0 4 -0.144792 -0.217132 -0.034039 38
2322 0.682409 3 1.136580 0 12 3 4 -0.144792 -0.217132 -0.034039 38
12652 0.025997 3 1.136580 2 11 0 4 -0.144792 -0.217132 0.369465 38
4755 -0.411611 3 -0.419265 2 11 0 4 -0.144792 -0.217132 1.176475 38
4413 0.390670 6 1.136580 4 4 1 4 -0.144792 -0.217132 -0.034039 38

Notice that categorical fields, like occupation, have already been converted to integers (with the same mapping that was used for training). Numerical fields, like age, have been scaled to a z-score. Some fields have been dropped from the original data. Compare the prediction input with the raw data for the same examples:


In [0]:
import pandas as pd

_, eval_file_path = util.download(util.DATA_DIR)
raw_eval_data = pd.read_csv(eval_file_path,
                            names=util._CSV_COLUMNS,
                            na_values='?')

raw_eval_data.iloc[prediction_input.index]


Out[0]:
age workclass fnlwgt education education_num marital_status occupation relationship race gender capital_gain capital_loss hours_per_week native_country income_bracket
1979 51 Local-gov 99064 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 0 0 35 United-States <=50K
2430 26 Private 197967 HS-grad 9 Never-married Craft-repair Own-child White Male 0 0 40 United-States <=50K
4214 22 Private 221694 Some-college 10 Never-married Protective-serv Not-in-family White Male 0 0 60 United-States <=50K
10389 30 Private 96480 Assoc-voc 11 Never-married Adm-clerical Own-child White Female 0 0 32 United-States <=50K
14525 18 Private 146225 10th 6 Never-married Other-service Own-child Amer-Indian-Eskimo Female 0 0 40 United-States <=50K
15040 37 Self-emp-not-inc 50096 Assoc-voc 11 Married-civ-spouse Craft-repair Husband White Male 0 0 30 United-States <=50K
8409 42 Private 102988 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 0 0 55 Ecuador >50K
10628 40 Local-gov 284086 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 7688 0 40 United-States >50K
10942 44 Self-emp-not-inc 52505 Some-college 10 Married-civ-spouse Farming-fishing Husband White Male 0 0 99 United-States <=50K
5129 54 Private 106728 Bachelors 13 Married-civ-spouse Tech-support Husband White Male 0 0 40 United-States <=50K
2096 21 Private 190916 Some-college 10 Never-married Sales Own-child White Female 0 0 20 United-States <=50K
12463 29 Private 197565 HS-grad 9 Married-civ-spouse Other-service Wife White Female 0 1902 35 United-States >50K
8528 46 Private 193188 Masters 14 Never-married Exec-managerial Unmarried White Male 0 0 40 United-States <=50K
7093 20 Private 273147 HS-grad 9 Never-married Machine-op-inspct Own-child Black Male 0 0 40 United-States <=50K
12565 46 Private 203653 Bachelors 13 Divorced Sales Other-relative Black Male 0 0 40 United-States <=50K
5655 57 Private 174662 HS-grad 9 Married-civ-spouse Craft-repair Husband White Male 0 0 40 United-States <=50K
2322 48 Private 232149 Bachelors 13 Divorced Tech-support Own-child White Female 0 0 40 United-States <=50K
12652 39 Private 82521 Bachelors 13 Married-civ-spouse Sales Husband White Male 0 0 45 United-States >50K
4755 33 Private 330715 HS-grad 9 Married-civ-spouse Sales Husband White Male 0 0 55 United-States <=50K
4413 44 State-gov 128586 Bachelors 13 Never-married Farming-fishing Not-in-family White Male 0 0 40 United-States <=50K
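
The integer codes in the prediction input correspond to positions in the category lists used at training time. For example, occupation code 9 in the first row maps back to Prof-specialty. Assuming the trainer's util module exposes the same _CATEGORICAL_TYPES mapping shown in Part 3, you could verify this with a small, optional check:

# Illustrative check; assumes util._CATEGORICAL_TYPES matches the mapping in Part 3.
occupation_categories = util._CATEGORICAL_TYPES['occupation'].categories
print(occupation_categories[9])  # Prof-specialty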

Export the prediction input to a newline-delimited JSON file:


In [0]:
import json

with open('prediction_input.json', 'w') as json_file:
  for row in prediction_input.values.tolist():
    json.dump(row, json_file)
    json_file.write('\n')

! cat prediction_input.json


[0.9012127751273994, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.43754385253479555, 38.0]
[-0.9221541171760282, 3.0, -0.4192650914017433, 4.0, 2.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.2138928199445767, 3.0, -0.030303770229214273, 4.0, 10.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.5799792247041626, 38.0]
[-0.6304154144074798, 3.0, 0.35865755094331475, 4.0, 0.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.6796466218034705, 38.0]
[-1.5056315227131252, 3.0, -1.5861490549193304, 4.0, 7.0, 3.0, 0.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-0.11987268456252011, 5.0, 0.35865755094331475, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.8410484679825871, 38.0]
[0.24480069389816542, 3.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 6.0]
[0.0989313425138912, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, 0.8868473744801746, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.39067004528243965, 5.0, -0.030303770229214273, 2.0, 4.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 4.7273152251969375, 38.0]
[1.1200168022038106, 3.0, 1.1365801932883728, 2.0, 12.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.2868274956367138, 3.0, -0.030303770229214273, 4.0, 11.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -1.6480576988781703, 38.0]
[-0.7033500900996169, 3.0, -0.4192650914017433, 2.0, 7.0, 5.0, 4.0, -0.14479173735784842, 4.5022796885373735, -0.43754385253479555, 38.0]
[0.5365393966667138, 3.0, 1.525541514460902, 4.0, 3.0, 4.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.3597621713288508, 3.0, -0.4192650914017433, 4.0, 6.0, 3.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.5365393966667138, 3.0, 1.1365801932883728, 0.0, 11.0, 2.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[1.338820829280222, 3.0, -0.4192650914017433, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.6824087480509881, 3.0, 1.1365801932883728, 0.0, 12.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.0259966668217541, 3.0, 1.1365801932883728, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 0.3694653783607877, 38.0]
[-0.4116113873310685, 3.0, -0.4192650914017433, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 38.0]
[0.39067004528243965, 6.0, 1.1365801932883728, 4.0, 4.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]

The gcloud command-line tool accepts newline-delimited JSON for online prediction, and this particular Keras model expects a flat list of numbers for each input example.

AI Platform requires a different format when you make online prediction requests to the REST API without using the gcloud tool. The way you structure your model may also change how you must format data for prediction. Learn more about formatting data for online prediction.
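
For reference, here is a minimal sketch of the same request made against the REST API with the Google API Python client. It assumes the googleapiclient library is installed, that your environment is already authenticated, and that PROJECT_ID is set to your own project ID:

from googleapiclient import discovery

# Illustrative sketch: PROJECT_ID is an assumption you must set yourself.
PROJECT_ID = '<your-project-id>'

service = discovery.build('ml', 'v1')
name = 'projects/{}/models/{}/versions/{}'.format(
    PROJECT_ID, MODEL_NAME, MODEL_VERSION)

# Unlike the newline-delimited file used with gcloud, the REST API expects a
# JSON body with an "instances" list.
response = service.projects().predict(
    name=name,
    body={'instances': prediction_input.values.tolist()}
).execute()

print(response)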

Submit the online prediction request

Use gcloud to submit your online prediction request.


In [0]:
! gcloud ai-platform predict \
  --model $MODEL_NAME \
  --version $MODEL_VERSION \
  --json-instances prediction_input.json


DENSE_4
[0.6854287385940552]
[0.011786997318267822]
[0.037236183881759644]
[0.016223609447479248]
[0.0012015104293823242]
[0.23621389269828796]
[0.6174039244651794]
[0.9822691679000854]
[0.3815768361091614]
[0.6715215444564819]
[0.001094043254852295]
[0.43077391386032104]
[0.22132840752601624]
[0.004075437784194946]
[0.22736871242523193]
[0.4111979305744171]
[0.27328649163246155]
[0.6981356143951416]
[0.3309604525566101]
[0.20807647705078125]

Since the model's last layer uses a sigmoid function for its activation, outputs between 0 and 0.5 represent negative predictions ("<=50K") and outputs between 0.5 and 1 represent positive ones (">50K").
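
For example, you could convert the raw outputs to 0/1 class predictions with a 0.5 threshold. The short snippet below is only an illustration; the probabilities list copies the first few values from the output above:

# Illustrative: threshold sigmoid outputs at 0.5 to get class predictions.
probabilities = [0.6854, 0.0118, 0.0372, 0.0162, 0.0012]
predicted_classes = [1 if p >= 0.5 else 0 for p in probabilities]
print(predicted_classes)  # [1, 0, 0, 0, 0]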

Do the predicted income brackets match the actual ones? Run the following cell to see the true labels.


In [0]:
prediction_targets


Out[0]:
array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [1.],
       [1.],
       [0.],
       [0.],
       [0.],
       [1.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [1.],
       [0.],
       [0.]], dtype=float32)

Part 3. Developing the Keras model from scratch

At this point, you have trained a machine learning model on AI Platform, deployed the trained model as a version resource on AI Platform, and received online predictions from the deployment. The next section walks through recreating the Keras code used to train your model. It covers the following parts of developing a machine learning model for use with AI Platform:

  • Downloading and preprocessing data
  • Designing and training the model
  • Visualizing training and exporting the trained model

While this section provides more detailed insight into the tasks completed in the previous parts, read TensorFlow's guide to Keras to learn more about using tf.keras in general. To learn more about structuring code as a training package for AI Platform, read Packaging a training application and reference the complete training code, which is structured as a Python package.

Import libraries and define constants

First, import Python libraries required for training:


In [0]:
import os
import sys
import tempfile

from six.moves import urllib

import numpy as np
import pandas as pd
import tensorflow as tf

# Examine software versions
print(sys.version)
print(tf.__version__)
print(tf.keras.__version__)


3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0]
1.15.2
2.2.4-tf

Then, define some useful constants:

  • Information for downloading training and evaluation data
  • Information required for Pandas to interpret the data and convert categorical fields into numeric features
  • Hyperparameters for training, such as learning rate and batch size

In [0]:
### For downloading data ###

# Storage directory
DATA_DIR = os.path.join(tempfile.gettempdir(), 'census_data')

# Download options.
DATA_URL = 'https://storage.googleapis.com/cloud-samples-data/ai-platform' \
           '/census/data'
TRAINING_FILE = 'adult.data.csv'
EVAL_FILE = 'adult.test.csv'
TRAINING_URL = '%s/%s' % (DATA_URL, TRAINING_FILE)
EVAL_URL = '%s/%s' % (DATA_URL, EVAL_FILE)

### For interpreting data ###

# These are the features in the dataset.
# Dataset information: https://archive.ics.uci.edu/ml/datasets/census+income
_CSV_COLUMNS = [
    'age', 'workclass', 'fnlwgt', 'education', 'education_num',
    'marital_status', 'occupation', 'relationship', 'race', 'gender',
    'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
    'income_bracket'
]

_CATEGORICAL_TYPES = {
    'workclass': pd.api.types.CategoricalDtype(categories=[
        'Federal-gov', 'Local-gov', 'Never-worked', 'Private', 'Self-emp-inc',
        'Self-emp-not-inc', 'State-gov', 'Without-pay'
    ]),
    'marital_status': pd.api.types.CategoricalDtype(categories=[
        'Divorced', 'Married-AF-spouse', 'Married-civ-spouse',
        'Married-spouse-absent', 'Never-married', 'Separated', 'Widowed'
    ]),
    'occupation': pd.api.types.CategoricalDtype([
        'Adm-clerical', 'Armed-Forces', 'Craft-repair', 'Exec-managerial',
        'Farming-fishing', 'Handlers-cleaners', 'Machine-op-inspct',
        'Other-service', 'Priv-house-serv', 'Prof-specialty', 'Protective-serv',
        'Sales', 'Tech-support', 'Transport-moving'
    ]),
    'relationship': pd.api.types.CategoricalDtype(categories=[
        'Husband', 'Not-in-family', 'Other-relative', 'Own-child', 'Unmarried',
        'Wife'
    ]),
    'race': pd.api.types.CategoricalDtype(categories=[
        'Amer-Indian-Eskimo', 'Asian-Pac-Islander', 'Black', 'Other', 'White'
    ]),
    'native_country': pd.api.types.CategoricalDtype(categories=[
        'Cambodia', 'Canada', 'China', 'Columbia', 'Cuba', 'Dominican-Republic',
        'Ecuador', 'El-Salvador', 'England', 'France', 'Germany', 'Greece',
        'Guatemala', 'Haiti', 'Holand-Netherlands', 'Honduras', 'Hong', 'Hungary',
        'India', 'Iran', 'Ireland', 'Italy', 'Jamaica', 'Japan', 'Laos', 'Mexico',
        'Nicaragua', 'Outlying-US(Guam-USVI-etc)', 'Peru', 'Philippines', 'Poland',
        'Portugal', 'Puerto-Rico', 'Scotland', 'South', 'Taiwan', 'Thailand',
        'Trinadad&Tobago', 'United-States', 'Vietnam', 'Yugoslavia'
    ]),
    'income_bracket': pd.api.types.CategoricalDtype(categories=[
        '<=50K', '>50K'
    ])
}

# This is the label (target) we want to predict.
_LABEL_COLUMN = 'income_bracket'

### Hyperparameters for training ###

# This is the training batch size
BATCH_SIZE = 128

# This is the number of epochs (passes over the full training data)
NUM_EPOCHS = 20

# Define learning rate.
LEARNING_RATE = .01

Download and preprocess data

Download the data

Next, define functions to download training and evaluation data. These functions also fix minor irregularities in the data's formatting.


In [0]:
def _download_and_clean_file(filename, url):
    """Downloads data from url, and makes changes to match the CSV format.
  
    The CSVs may use spaces after the comma delimiters (non-standard) or include
    rows which do not represent well-formed examples. This function strips out
    some of these problems.
  
    Args:
      filename: filename to save url to
      url: URL of resource to download
    """
    temp_file, _ = urllib.request.urlretrieve(url)
    with tf.io.gfile.GFile(temp_file, 'r') as temp_file_object:
        with tf.io.gfile.GFile(filename, 'w') as file_object:
            for line in temp_file_object:
                line = line.strip()
                line = line.replace(', ', ',')
                if not line or ',' not in line:
                    continue
                if line[-1] == '.':
                    line = line[:-1]
                line += '\n'
                file_object.write(line)
    tf.io.gfile.remove(temp_file)


def download(data_dir):
    """Downloads census data if it is not already present.
  
    Args:
      data_dir: directory where we will access/save the census data
    """
    tf.io.gfile.makedirs(data_dir)

    training_file_path = os.path.join(data_dir, TRAINING_FILE)
    if not tf.io.gfile.exists(training_file_path):
        _download_and_clean_file(training_file_path, TRAINING_URL)

    eval_file_path = os.path.join(data_dir, EVAL_FILE)
    if not tf.io.gfile.exists(eval_file_path):
        _download_and_clean_file(eval_file_path, EVAL_URL)

    return training_file_path, eval_file_path

Use those functions to download the data for training and verify that you have CSV files for training and evaluation:


In [0]:
training_file_path, eval_file_path = download(DATA_DIR)

# You should see 2 files: adult.data.csv and adult.test.csv
!ls -l $DATA_DIR


total 5156
-rw-r--r-- 1 root root 3518450 Mar 27 17:52 adult.data.csv
-rw-r--r-- 1 root root 1758573 Mar 27 17:52 adult.test.csv

Next, load these files using Pandas and examine the data:


In [0]:
# This census data uses the value '?' for fields (columns) that are missing data.
# We use na_values to find '?' and convert those entries to NaN.
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

train_df = pd.read_csv(training_file_path, names=_CSV_COLUMNS, na_values='?')
eval_df = pd.read_csv(eval_file_path, names=_CSV_COLUMNS, na_values='?')

# Here's what the data looks like before we preprocess the data.
train_df.head()


Out[0]:
age workclass fnlwgt education education_num marital_status occupation relationship race gender capital_gain capital_loss hours_per_week native_country income_bracket
0 39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
1 50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States <=50K
2 38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K
3 53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K
4 28 Private 338409 Bachelors 13 Married-civ-spouse Prof-specialty Wife Black Female 0 0 40 Cuba <=50K

Preprocess the data

The first preprocessing step removes certain features from the data and converts categorical features to numerical values for use with Keras.

Learn more about feature engineering and bias in data.


In [0]:
UNUSED_COLUMNS = ['fnlwgt', 'education', 'gender']


def preprocess(dataframe):
    """Converts categorical features to numeric. Removes unused columns.
  
    Args:
      dataframe: Pandas dataframe with raw data
  
    Returns:
      Dataframe with preprocessed data
    """
    dataframe = dataframe.drop(columns=UNUSED_COLUMNS)

    # Convert integer valued (numeric) columns to floating point
    numeric_columns = dataframe.select_dtypes(['int64']).columns
    dataframe[numeric_columns] = dataframe[numeric_columns].astype('float32')

    # Convert categorical columns to numeric
    cat_columns = dataframe.select_dtypes(['object']).columns
    dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.astype(
        _CATEGORICAL_TYPES[x.name]))
    dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.cat.codes)
    return dataframe


prepped_train_df = preprocess(train_df)
prepped_eval_df = preprocess(eval_df)

Run the following cell to see how preprocessing changed the data. Notice in particular that income_bracket, the label that you're training the model to predict, has changed from <=50K and >50K to 0 and 1:


In [0]:
prepped_train_df.head()


Out[0]:
age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country income_bracket
0 39.0 6 13.0 4 0 1 4 2174.0 0.0 40.0 38 0
1 50.0 5 13.0 2 3 0 4 0.0 0.0 13.0 38 0
2 38.0 3 9.0 0 5 1 4 0.0 0.0 40.0 38 0
3 53.0 3 7.0 2 5 0 2 0.0 0.0 40.0 38 0
4 28.0 3 13.0 2 9 5 2 0.0 0.0 40.0 4 0

Next, separate the data into features ("x") and labels ("y"), and reshape the label arrays into a format for use with tf.data.Dataset later:


In [0]:
# Split train and test data with labels.
# The pop() method will extract (copy) and remove the label column from the dataframe
train_x, train_y = prepped_train_df, prepped_train_df.pop(_LABEL_COLUMN)
eval_x, eval_y = prepped_eval_df, prepped_eval_df.pop(_LABEL_COLUMN)

# Reshape label columns for use with tf.data.Dataset
train_y = np.asarray(train_y).astype('float32').reshape((-1, 1))
eval_y = np.asarray(eval_y).astype('float32').reshape((-1, 1))
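
As a quick, optional check, you can verify that each feature array has the same number of rows as its label array (the exact counts depend on the downloaded files):

# Each features DataFrame should have the same number of rows as its label array.
print(train_x.shape, train_y.shape)
print(eval_x.shape, eval_y.shape)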

Scaling training data so each numerical feature column has a mean of 0 and a standard deviation of 1 can improve your model.

In a production system, you may want to save the means and standard deviations from your training set and use them to perform an identical transformation on test data at prediction time. For convenience in this exercise, temporarily combine the training and evaluation data to scale all of them:


In [0]:
def standardize(dataframe):
    """Scales numerical columns using their means and standard deviation to get
    z-scores: the mean of each numerical column becomes 0, and the standard
    deviation becomes 1. This can help the model converge during training.
  
    Args:
      dataframe: Pandas dataframe
  
    Returns:
      Input dataframe with the numerical columns scaled to z-scores
    """
    dtypes = list(zip(dataframe.dtypes.index, map(str, dataframe.dtypes)))
    # Normalize numeric columns.
    for column, dtype in dtypes:
        if dtype == 'float32':
            dataframe[column] -= dataframe[column].mean()
            dataframe[column] /= dataframe[column].std()
    return dataframe


# Join train_x and eval_x to normalize on overall means and standard
# deviations. Then separate them again.
all_x = pd.concat([train_x, eval_x], keys=['train', 'eval'])
all_x = standardize(all_x)
train_x, eval_x = all_x.xs('train'), all_x.xs('eval')

Finally, examine some of your fully preprocessed training data:


In [0]:
# Verify dataset features
# Note how only the numeric fields (not categorical) have been standardized
train_x.head()


Out[0]:
age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country
0 0.025997 6 1.136580 4 0 1 4 0.146933 -0.217132 -0.034039 38
1 0.828278 5 1.136580 2 3 0 4 -0.144792 -0.217132 -2.212964 38
2 -0.046938 3 -0.419265 0 5 1 4 -0.144792 -0.217132 -0.034039 38
3 1.047082 3 -1.197188 2 5 0 2 -0.144792 -0.217132 -0.034039 38
4 -0.776285 3 1.136580 2 9 5 2 -0.144792 -0.217132 -0.034039 4

Design and train the model

Create training and validation datasets

Create an input function to convert features and labels into a tf.data.Dataset for training or evaluation:


In [0]:
def input_fn(features, labels, shuffle, num_epochs, batch_size):
    """Generates an input function to be used for model training.
  
    Args:
      features: numpy array of features used for training or inference
      labels: numpy array of labels for each example
      shuffle: boolean for whether to shuffle the data or not (set True for
        training, False for evaluation)
      num_epochs: number of epochs to provide the data for
      batch_size: batch size for training
  
    Returns:
      A tf.data.Dataset that can provide data to the Keras model for training or
        evaluation
    """
    if labels is None:
        inputs = features
    else:
        inputs = (features, labels)
    dataset = tf.data.Dataset.from_tensor_slices(inputs)

    if shuffle:
        dataset = dataset.shuffle(buffer_size=len(features))

    # We call repeat after shuffling, rather than before, to prevent separate
    # epochs from blending together.
    dataset = dataset.repeat(num_epochs)
    dataset = dataset.batch(batch_size)
    return dataset

Next, create these training and evaluation datasets. Use the NUM_EPOCHS and BATCH_SIZE hyperparameters defined previously to determine how the training dataset provides examples to the model during training. Set up the validation dataset to provide all its examples in one batch, for a single validation step at the end of each training epoch.


In [0]:
# Pass a numpy array by using DataFrame.values
training_dataset = input_fn(features=train_x.values, 
                    labels=train_y, 
                    shuffle=True, 
                    num_epochs=NUM_EPOCHS, 
                    batch_size=BATCH_SIZE)

num_eval_examples = eval_x.shape[0]

# Pass a numpy array by using DataFrame.values
validation_dataset = input_fn(features=eval_x.values, 
                    labels=eval_y, 
                    shuffle=False, 
                    num_epochs=NUM_EPOCHS, 
                    batch_size=num_eval_examples)

Design a Keras Model

Design your neural network using the Keras Sequential API.

This deep neural network (DNN) has several hidden layers, and the last layer uses a sigmoid activation function to output a value between 0 and 1:

  • The input layer has 100 units using the ReLU activation function.
  • The hidden layer has 75 units using the ReLU activation function.
  • The hidden layer has 50 units using the ReLU activation function.
  • The hidden layer has 25 units using the ReLU activation function.
  • The output layer has 1 unit using a sigmoid activation function.
  • The model is compiled with the binary cross-entropy loss function, which is appropriate for a binary classification problem like this one.

Feel free to change these layers to try to improve the model:


In [0]:
def create_keras_model(input_dim, learning_rate):
    """Creates Keras Model for Binary Classification.
  
    Args:
      input_dim: How many features the input has
      learning_rate: Learning rate for training
  
    Returns:
      The compiled Keras model (still needs to be trained)
    """
    Dense = tf.keras.layers.Dense
    model = tf.keras.Sequential(
      [
          Dense(100, activation=tf.nn.relu, kernel_initializer='uniform',
                  input_shape=(input_dim,)),
          Dense(75, activation=tf.nn.relu),
          Dense(50, activation=tf.nn.relu),
          Dense(25, activation=tf.nn.relu),
          Dense(1, activation=tf.nn.sigmoid)
      ])
    # Custom Optimizer:
    # https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
    optimizer = tf.keras.optimizers.RMSprop(
        lr=learning_rate)

    # Compile Keras model
    model.compile(
        loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

Next, create the Keras model object and examine its structure:


In [0]:
num_train_examples, input_dim = train_x.shape
print('Number of features: {}'.format(input_dim))
print('Number of examples: {}'.format(num_train_examples))

keras_model = create_keras_model(
    input_dim=input_dim,
    learning_rate=LEARNING_RATE)

# Take a detailed look inside the model
keras_model.summary()


Number of features: 11
Number of examples: 32561
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 100)               1200      
_________________________________________________________________
dense_1 (Dense)              (None, 75)                7575      
_________________________________________________________________
dense_2 (Dense)              (None, 50)                3800      
_________________________________________________________________
dense_3 (Dense)              (None, 25)                1275      
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 26        
=================================================================
Total params: 13,876
Trainable params: 13,876
Non-trainable params: 0
_________________________________________________________________

Train and evaluate the model

Define a learning rate decay to encourage model parameters to make smaller changes as training goes on:


In [0]:
# Setup Learning Rate decay.
lr_decay_cb = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch)),
    verbose=True)

# Setup TensorBoard callback.
tensorboard_cb = tf.keras.callbacks.TensorBoard(
      os.path.join(JOB_DIR, 'keras_tensorboard'),
      histogram_freq=1)
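
As a sanity check, you can print the first few values this schedule produces; they should match the "Epoch 000NN: LearningRateScheduler" lines in the training output:

# Illustrative: the learning rate scheduled for the first few epochs.
for epoch in range(4):
    print('Epoch {}: {}'.format(epoch + 1,
                                LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch))))
# Epoch 1: 0.02
# Epoch 2: 0.015
# Epoch 3: 0.0125
# Epoch 4: 0.01125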

Finally, train the model. Provide the appropriate steps_per_epoch for the model to train on the entire training dataset (with BATCH_SIZE examples per step) during each epoch, and instruct the model to calculate validation accuracy with one big validation batch at the end of each epoch.


In [0]:
history = keras_model.fit(training_dataset, 
                          epochs=NUM_EPOCHS, 
                          steps_per_epoch=int(num_train_examples/BATCH_SIZE), 
                          validation_data=validation_dataset, 
                          validation_steps=1, 
                          callbacks=[lr_decay_cb, tensorboard_cb],
                          verbose=1)


WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.

Epoch 00001: LearningRateScheduler reducing learning rate to 0.02.
Epoch 1/20
254/254 [==============================] - 1s 5ms/step - loss: 0.6986 - acc: 0.7893 - val_loss: 0.3894 - val_acc: 0.8329

Epoch 00002: LearningRateScheduler reducing learning rate to 0.015.
Epoch 2/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3574 - acc: 0.8335 - val_loss: 0.3861 - val_acc: 0.8131

Epoch 00003: LearningRateScheduler reducing learning rate to 0.0125.
Epoch 3/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3422 - acc: 0.8404 - val_loss: 0.3304 - val_acc: 0.8445

Epoch 00004: LearningRateScheduler reducing learning rate to 0.01125.
Epoch 4/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3353 - acc: 0.8465 - val_loss: 0.3610 - val_acc: 0.8435

Epoch 00005: LearningRateScheduler reducing learning rate to 0.010625.
Epoch 5/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3315 - acc: 0.8457 - val_loss: 0.3288 - val_acc: 0.8445

Epoch 00006: LearningRateScheduler reducing learning rate to 0.0103125.
Epoch 6/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3294 - acc: 0.8475 - val_loss: 0.3331 - val_acc: 0.8489

Epoch 00007: LearningRateScheduler reducing learning rate to 0.01015625.
Epoch 7/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3296 - acc: 0.8476 - val_loss: 0.3296 - val_acc: 0.8508

Epoch 00008: LearningRateScheduler reducing learning rate to 0.010078125.
Epoch 8/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3287 - acc: 0.8486 - val_loss: 0.3254 - val_acc: 0.8494

Epoch 00009: LearningRateScheduler reducing learning rate to 0.0100390625.
Epoch 9/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3274 - acc: 0.8483 - val_loss: 0.3205 - val_acc: 0.8511

Epoch 00010: LearningRateScheduler reducing learning rate to 0.01001953125.
Epoch 10/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3274 - acc: 0.8493 - val_loss: 0.3233 - val_acc: 0.8483

Epoch 00011: LearningRateScheduler reducing learning rate to 0.010009765625.
Epoch 11/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3268 - acc: 0.8485 - val_loss: 0.3315 - val_acc: 0.8511

Epoch 00012: LearningRateScheduler reducing learning rate to 0.010004882812500001.
Epoch 12/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3258 - acc: 0.8500 - val_loss: 0.3328 - val_acc: 0.8502

Epoch 00013: LearningRateScheduler reducing learning rate to 0.01000244140625.
Epoch 13/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3278 - acc: 0.8488 - val_loss: 0.3196 - val_acc: 0.8536

Epoch 00014: LearningRateScheduler reducing learning rate to 0.010001220703125.
Epoch 14/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3261 - acc: 0.8508 - val_loss: 0.3355 - val_acc: 0.8384

Epoch 00015: LearningRateScheduler reducing learning rate to 0.0100006103515625.
Epoch 15/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3249 - acc: 0.8508 - val_loss: 0.3379 - val_acc: 0.8478

Epoch 00016: LearningRateScheduler reducing learning rate to 0.01000030517578125.
Epoch 16/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3277 - acc: 0.8485 - val_loss: 0.3253 - val_acc: 0.8524

Epoch 00017: LearningRateScheduler reducing learning rate to 0.010000152587890625.
Epoch 17/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3251 - acc: 0.8521 - val_loss: 0.3261 - val_acc: 0.8512

Epoch 00018: LearningRateScheduler reducing learning rate to 0.010000076293945313.
Epoch 18/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3287 - acc: 0.8494 - val_loss: 0.3232 - val_acc: 0.8543

Epoch 00019: LearningRateScheduler reducing learning rate to 0.010000038146972657.
Epoch 19/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3239 - acc: 0.8512 - val_loss: 0.3334 - val_acc: 0.8496

Epoch 00020: LearningRateScheduler reducing learning rate to 0.010000019073486329.
Epoch 20/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3279 - acc: 0.8504 - val_loss: 0.3174 - val_acc: 0.8523

Visualize training and export the trained model

Visualize training

Import matplotlib to visualize how the model learned over the training period.


In [0]:
! pip install matplotlib

from matplotlib import pyplot as plt

%matplotlib inline


Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (3.0.3)
Requirement already satisfied: numpy>=1.10.0 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (1.14.6)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (2.5.3)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (1.0.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (2.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (0.10.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib) (1.11.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib) (40.8.0)

Plot the model's loss (binary cross-entropy) and accuracy, as measured at the end of each training epoch:


In [0]:
# Visualize History for Loss.
plt.title('Keras model loss')
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['training', 'validation'], loc='upper right')
plt.show()

# Visualize History for Accuracy.
plt.title('Keras model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc='lower right')
plt.show()


Over time, loss decreases and accuracy increases. But do they converge to a stable level? Are there big differences between the training and validation metrics (a sign of overfitting)?
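
A quick numeric check can complement the plots: compare the final-epoch training and validation metrics directly. This is just a sketch using the history object returned by fit above; a large gap between training and validation accuracy is one rough signal of overfitting.


In [0]:
# Compare the last-epoch training and validation metrics recorded in `history`.
final_train_acc = history.history['acc'][-1]
final_val_acc = history.history['val_acc'][-1]
final_train_loss = history.history['loss'][-1]
final_val_loss = history.history['val_loss'][-1]

print('Final training accuracy:    {:.4f}'.format(final_train_acc))
print('Final validation accuracy:  {:.4f}'.format(final_val_acc))
print('Accuracy gap (train - val): {:.4f}'.format(final_train_acc - final_val_acc))
print('Final training loss:        {:.4f}'.format(final_train_loss))
print('Final validation loss:      {:.4f}'.format(final_val_loss))
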

Learn how to improve your machine learning model. Then, feel free to adjust hyperparameters or the model architecture and train again.
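
As an illustration only, the sketch below recompiles keras_model with a smaller learning rate and trains for a few more epochs. The names training_dataset, validation_dataset, num_train_examples, and num_eval_examples are placeholders for the input datasets and example counts you built earlier in this tutorial; substitute whatever you actually used.


In [0]:
# Sketch only: retrain with a smaller learning rate.
# `training_dataset`, `validation_dataset`, `num_train_examples`, and
# `num_eval_examples` are placeholders for objects created earlier in the notebook.
keras_model.compile(
    loss='binary_crossentropy',
    optimizer=tf.keras.optimizers.RMSprop(lr=0.001),
    metrics=['accuracy'])

history = keras_model.fit(
    training_dataset,
    steps_per_epoch=int(num_train_examples / 128),
    epochs=10,
    validation_data=validation_dataset,
    validation_steps=int(num_eval_examples / 128),
    verbose=1)
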

Export the model for serving

To serve predictions from your trained model, export it in the TensorFlow SavedModel format. AI Platform requires a SavedModel when you create a model version resource.

Since not all optimizers can be exported to the SavedModel format, you may see warnings during the export process. As long as you successfully export a serving graph, AI Platform can use the SavedModel to serve predictions.


In [0]:
# Export the model to a local SavedModel directory 
export_path = tf.keras.experimental.export_saved_model(keras_model, 'keras_export')
print("Model exported to: ", export_path)


WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/network.py:1436: update_checkpoint_state (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.train.CheckpointManager to manage checkpoints rather than manually editing the Checkpoint proto.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: keras_export/1553710367/saved_model.pb
Model exported to:  b'keras_export/1553710367'
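
Before deploying, you can optionally inspect the serving signature of the exported SavedModel. The saved_model_cli tool ships with the TensorFlow pip package; this sketch assumes it is on your PATH and that export_path is the byte string returned by the export call above, as shown in the output.


In [0]:
# Inspect the serving signature of the SavedModel exported above.
# `export_path` is the byte string returned by export_saved_model.
! saved_model_cli show --dir {export_path.decode()} --tag_set serve --signature_def serving_default
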

You may export a SavedModel directory to your local filesystem or to Cloud Storage, as long as you have the necessary permissions. In your current environment, you granted access to Cloud Storage by authenticating your GCP account and setting the GOOGLE_APPLICATION_CREDENTIALS environment variable. AI Platform training jobs can also export directly to Cloud Storage, because AI Platform service accounts have access to Cloud Storage buckets in their own project.

Try exporting directly to Cloud Storage:


In [0]:
# Export the model to a SavedModel directory in Cloud Storage
export_path = tf.keras.experimental.export_saved_model(keras_model, JOB_DIR + '/keras_export')
print("Model exported to: ", export_path)


WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/saved_model.pb
Model exported to:  b'gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379'

You can now deploy this model to AI Platform and serve predictions by following the steps from Part 2.
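
As a reminder of what those deployment steps look like, here is a hedged sketch of the gcloud command for creating a model version from the Cloud Storage export above. It assumes MODEL_NAME and MODEL_VERSION are still set from Part 2, that the model resource already exists, and that --origin points at the directory containing saved_model.pb; the runtime and Python versions shown are examples that you should match to your own environment.


In [0]:
# Sketch only: create a model version from the SavedModel exported above.
# Assumes MODEL_NAME and MODEL_VERSION are set and the model resource exists.
# The runtime and Python versions below are examples; match them to your setup.
! gcloud ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --origin {export_path.decode()} \
  --runtime-version 1.13 \
  --python-version 3.5 \
  --framework tensorflow
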

Cleaning up

To clean up all GCP resources used in this project, you can delete the GCP project you used for the tutorial.

Alternatively, you can clean up individual resources by running the following commands:


In [0]:
# Delete model version resource
! gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME 

# Delete model resource
! gcloud ai-platform models delete $MODEL_NAME --quiet

# Delete Cloud Storage objects that were created
! gsutil -m rm -r $JOB_DIR

# If the training job is still running, cancel it
! gcloud ai-platform jobs cancel $JOB_NAME --quiet --verbosity critical


Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/#1553709423098988...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/#1553709422852130...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/variables/checkpoint#1553709429999272...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/variables/#1553709428368512...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/assets/#1553709430948151...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/variables/variables.data-00000-of-00001#1553709428975750...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/assets/saved_model.json#1553709431121952...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/variables/variables.index#1553709429461522...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553709421/saved_model.pb#1553709430502605...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/#1553710381998179...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/assets/#1553710395035632...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/assets/saved_model.json#1553710395421499...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/saved_model.pb#1553710394249816...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/variables/#1553710390778836...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/variables/checkpoint#1553710393369087...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/variables/variables.data-00000-of-00001#1553710391609457...
Removing gs://<your-bucket-name>/keras-job-dir/keras_export/1553710379/variables/variables.index#1553710392464814...
Removing gs://<your-bucket-name>/keras-job-dir/packages/dcc159f40836cff74a27866227b327b0a8ccb5266194e76cff5368266b6d1cdd/trainer-0.0.0.tar.gz#1553709266664674...
/ [18/18 objects] 100% Done                                                     
Operation completed over 18 objects.                                             

If your Cloud Storage bucket doesn't contain any other objects and you would like to delete it, run gsutil rm -r gs://$BUCKET_NAME.
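
For convenience, here is that same command as a notebook cell (this assumes BUCKET_NAME is still set from earlier in this notebook):


In [0]:
# Delete the bucket and everything in it. Only run this if you are sure
# the bucket contains nothing you want to keep.
! gsutil rm -r gs://$BUCKET_NAME
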

What's next?