In [ ]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Overview

{Include a paragraph or two explaining what this example demonstrates, who should be interested in it, and what you need to know before you get started.}

Dataset

{Include a paragraph with Dataset information and where to obtain it}

Objective

In this notebook, you will learn how to {complete the sentence briefly explaining what the notebook teaches, for example ML training, hyperparameter tuning, or serving}. The steps performed include:

* { add high level bullets for the steps of what you will perform in the notebook }

Costs

Example:

This tutorial uses billable components of Google Cloud Platform (GCP):

  • Cloud AI Platform
  • Cloud Storage

Learn about Cloud AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

Set up your local development environment

If you are using Colab or AI Platform Notebooks, your environment already meets all the requirements to run this notebook. You can skip this step.

Otherwise, make sure your environment meets this notebook's requirements. You need the following:

  • The Google Cloud SDK
  • Git
  • Python 3
  • virtualenv
  • Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to Setting up a Python development environment and the Jupyter installation guide provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:

  1. Install and initialize the Cloud SDK.

  2. Install Python 3.

  3. Install virtualenv and create a virtual environment that uses Python 3 (example commands for steps 3 through 5 follow this list).

  4. Activate that environment and run pip install jupyter in a shell to install Jupyter.

  5. Run jupyter notebook in a shell to launch Jupyter.

  6. Open this notebook in the Jupyter Notebook Dashboard.
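
For reference, steps 3 through 5 look roughly like the following in a Linux or macOS shell; the environment name my-env is an illustrative placeholder, and exact commands vary by system:

  pip3 install --user virtualenv
  virtualenv -p python3 my-env
  source my-env/bin/activate
  pip install jupyter
  jupyter notebook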

Installation

Install additional dependencies that are not already installed in your notebook environment (e.g., XGBoost, adanet, tf-hub).

  • Use the latest major GA version of the framework.

In [ ]:
%pip install -U missing_or_updating_package --user

Restart the kernel

Once you've installed the {packages}, restart the notebook kernel so it can find them.


In [ ]:
# Automatically restart kernel after installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

Before you begin

GPU run-time

Make sure you're running this notebook in a GPU runtime if you have that option. In Colab, select Runtime > Change runtime type and choose GPU as the hardware accelerator.
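
If a GPU is attached, you can confirm that the runtime sees it; this assumes an NVIDIA GPU with drivers installed:

In [ ]:
# Verify that an NVIDIA GPU is visible to the runtime
! nvidia-smi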

Follow the "before you begin" steps in the guide

{Add a link to any online "before you begin" tutorial for the product}

Set up your GCP project

The following steps are required, regardless of your notebook environment.

  1. Select or create a GCP project. When you first create an account, you get a $300 free credit towards your compute/storage costs.

  2. Make sure that billing is enabled for your project.

  3. Enable the AI Platform APIs and Compute Engine APIs.

  4. If you are running this notebook locally, you will need to install the Google Cloud SDK.

  5. Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.

Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
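
For example, the cell below uses a hypothetical SAMPLE_MESSAGE variable to show both behaviors:

In [ ]:
# Jupyter runs lines prefixed with ! in the shell, and substitutes
# Python variables prefixed with $ into those commands.
SAMPLE_MESSAGE = "hello from Python"  # hypothetical variable for illustration
! echo $SAMPLE_MESSAGE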

Project ID

If you don't know your project ID, you may be able to get your PROJECT_ID using gcloud.


In [ ]:
# Get your GCP project id from gcloud
shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
PROJECT_ID = shell_output[0] if shell_output else ""
print("Project ID: ", PROJECT_ID)

Otherwise, set your project ID here.


In [ ]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "[your-project-id]" #@param {type:"string"}

Authenticate your GCP account

If you are using AI Platform Notebooks, your environment is already authenticated. Skip this step.

If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

Otherwise, follow these steps:

  1. In the GCP Console, go to the Create service account key page.

  2. From the Service account drop-down list, select New service account.

  3. In the Service account name field, enter a name.

  4. From the Role drop-down list, select Machine Learning Engine > AI Platform Admin and Storage > Storage Object Admin.

  5. Click Create. A JSON file that contains your key downloads to your local environment.

  6. Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.


In [ ]:
import sys, os

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

# If on AI Platform Notebooks, skip this: the environment is already
# authenticated.
if not os.path.exists('/opt/deeplearning/metadata/env_version'):
    if 'google.colab' in sys.modules:
        from google.colab import auth as google_auth
        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below
    # with the path to your service account key and run this cell to
    # authenticate your GCP account.
    else:
        %env GOOGLE_APPLICATION_CREDENTIALS ''

Create a Cloud Storage bucket

The following steps are required, regardless of your notebook environment.

When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. AI Platform runs the code from this package. In this tutorial, AI Platform also saves the trained model that results from your job in the same bucket. You can then create an AI Platform model version based on this output in order to serve online predictions.

Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.

You may also change the REGION variable, which is used for operations throughout the rest of this notebook. Make sure to choose a region where Cloud AI Platform services are available. You may not use a Multi-Regional Storage bucket for training with AI Platform.


In [ ]:
BUCKET_NAME = "[your-bucket-name]" #@param {type:"string"}
REGION = "us-central1" #@param {type:"string"}

Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.


In [ ]:
! gsutil mb -l $REGION gs://$BUCKET_NAME

Finally, validate access to your Cloud Storage bucket by examining its contents:


In [ ]:
! gsutil ls -al gs://$BUCKET_NAME
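
For context on how this bucket is used later, a training job submission with the Cloud SDK looks roughly like the following sketch; the job name my_job and the trainer package/module are hypothetical placeholders, not part of this template:

In [ ]:
# Illustrative only: submit a training job whose code is staged in the bucket.
# my_job, trainer/, and trainer.task are hypothetical placeholders.
! gcloud ai-platform jobs submit training my_job \
    --package-path trainer/ \
    --module-name trainer.task \
    --region $REGION \
    --staging-bucket gs://$BUCKET_NAME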

Import libraries and define constants

{Put all your imports and installs up into a setup section.}


In [ ]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

import numpy as np

import sys, os

Notes

The tips below are specific to notebooks for TensorFlow/scikit-learn/PyTorch/XGBoost code.

General

  • Include the collapsed license at the top (this uses Colab's "Form" mode to hide the cells).
  • Only include a single H1 title.
  • Include the button-bar immediately under the H1.
  • Include an overview section before any code.
  • Put all your installs and imports in a setup section.
  • Always include the three future imports.
  • Save the notebook with the Table of Contents open.
  • Write python3 compatible code.
  • Keep cells small (~max 20 lines).

Python style guide

  • As Guido van Rossum said, "Code is read much more often than it is written." Please follow the guidelines for writing Python code in the Python style guide.
  • Readable code is critical, especially in notebooks: following guidelines that others recognize makes it easier for them to read and understand your code.

  • Google Python Style guide

Code content

Use the highest-level API that gets the job done (unless the goal is to demonstrate the low-level API). For example, when using TensorFlow:

  • Use tf.keras.Sequential > Keras functional API > Keras model subclassing > ...

  • Use model.fit > model.train_on_batch > manual GradientTape loops.

  • Use eager-style code.

  • Use tensorflow_datasets and tf.data where possible (see the sketch after this list).
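
To illustrate the last two points, here is a minimal, self-contained sketch that feeds a tf.data pipeline (over synthetic data made up for this example) to model.fit:

In [ ]:
# Minimal illustration of high-level APIs: tf.data for the input
# pipeline and model.fit for the training loop. The data is synthetic.
import numpy as np
import tensorflow as tf

features = np.random.randn(64, 5).astype("float32")
labels = np.random.randint(0, 3, size=(64,))

dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(16)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu', input_shape=(5,)),
    tf.keras.layers.Dense(3)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
model.fit(dataset, epochs=2)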

Text

  • Use an imperative style. "Run a batch of images through the model."

  • Use sentence case in titles/headings.

  • Use short titles/headings: "Download the data", "Build the model", "Train the model".

Code style

  • Notebooks are for people. Write code optimized for clarity.

  • Demonstrate small parts before combining them into something more complex, as shown below:


In [ ]:
# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu', input_shape=(None, 5)),
    tf.keras.layers.Dense(3)
])

In [ ]:
# Run the model on a single batch of data, and inspect the output.
result = model(tf.constant(np.random.randn(10, 5), dtype=tf.float32)).numpy()

print("min:", result.min())
print("max:", result.max())
print("mean:", result.mean())
print("shape:", result.shape)

In [ ]:
# Compile the model for training
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.categorical_crossentropy)

  • Keep examples quick. Use small datasets, or small slices of datasets. You don't need to train to convergence; train until it's obvious it's making progress.

  • For a large example, don't try to fit all the code in the notebook. Add Python files to the tensorflow/examples repository, and in the notebook run: ! pip install git+https://github.com/tensorflow/examples

Cleaning up

To clean up all GCP resources used in this project, you can delete the GCP project you used for the tutorial.

{Include commands to delete individual resources below}


In [ ]:
# Delete model version resource
! gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME 

# Delete model resource
! gcloud ai-platform models delete $MODEL_NAME --quiet

# Delete Cloud Storage objects that were created
! gsutil -m rm -r $JOB_DIR

# If training job is still running, cancel it
! gcloud ai-platform jobs cancel $JOB_NAME --quiet
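
If you created the Cloud Storage bucket solely for this tutorial, you can also remove the bucket itself; note that this deletes the bucket and any objects still in it:

In [ ]:
# Delete the Cloud Storage bucket and all remaining objects in it
! gsutil -m rm -r gs://$BUCKET_NAME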