In [ ]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
{Include a paragraph or two explaining what this example demonstrates, who should be interested in it, and what you need to know before you get started.}
{Include a paragraph with Dataset information and where to obtain it}
In this notebook, you will learn how to {Complete the sentence by briefly explaining what you will learn from the notebook, for example ML training, HP tuning, serving}. The steps performed include:
* { add high level bullets for the steps of what you will perform in the notebook }
Example:
This tutorial uses billable components of Google Cloud Platform (GCP):
* Cloud AI Platform
* Cloud Storage
Learn about Cloud AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
Otherwise, make sure your environment meets this notebook's requirements: Python 3, virtualenv, Jupyter, and the Google Cloud SDK.
The Google Cloud guide to Setting up a Python development environment and the Jupyter installation guide provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:
1. Install virtualenv and create a virtual environment that uses Python 3.
2. Activate that environment and run pip install jupyter in a shell to install Jupyter.
3. Run jupyter notebook in a shell to launch Jupyter.
4. Open this notebook in the Jupyter Notebook Dashboard.
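For example, the condensed steps above look something like this in a local shell (the environment name notebook-env is illustrative):
pip install virtualenv
virtualenv -p python3 notebook-env
source notebook-env/bin/activate
pip install jupyter
jupyter notebook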
In [ ]:
%pip install -U missing_or_updating_package --user
In [ ]:
# Automatically restart kernel after installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
The following steps are required, regardless of your notebook environment.
Select or create a GCP project. When you first create an account, you get a $300 free credit towards your compute/storage costs.
If you are running this notebook locally, you will need to install Google Cloud SDK.
Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.
Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
In [ ]:
# Get your GCP project id from gcloud
shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
PROJECT_ID = shell_output[0]
print("Project ID: ", PROJECT_ID)
Otherwise, set your project id here.
In [ ]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "[your-project-id]"  #@param {type:"string"}
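To make sure the Cloud SDK uses this project for all the commands in the rest of the notebook, you can also set it explicitly (an optional step that uses the PROJECT_ID value above):
In [ ]:
# Set the active project for the gcloud and gsutil commands in this notebook
! gcloud config set project $PROJECT_ID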
If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.
Otherwise, follow these steps:
1. In the GCP Console, go to the Create service account key page.
2. From the Service account drop-down list, select New service account.
3. In the Service account name field, enter a name.
4. From the Role drop-down list, select Machine Learning Engine > AI Platform Admin and Storage > Storage Object Admin.
5. Click Create. A JSON file that contains your key downloads to your local environment.
Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.
In [ ]:
import sys, os

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

# If on AI Platform, then don't execute this code
if not os.path.exists('/opt/deeplearning/metadata/env_version'):
    if 'google.colab' in sys.modules:
        from google.colab import auth as google_auth
        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    else:
        %env GOOGLE_APPLICATION_CREDENTIALS ''
The following steps are required, regardless of your notebook environment.
When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. AI Platform runs the code from this package. In this tutorial, AI Platform also saves the trained model that results from your job in the same bucket. You can then create an AI Platform model version based on this output in order to serve online predictions.
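As a rough sketch of that flow, the cell below shows a hypothetical job submission and model deployment. The trainer package, job name, and model name are placeholders for this template, and the commands are commented out so you can adapt them once the bucket and region are set further down:
In [ ]:
# Sketch only: submit a training job, then deploy its output for online predictions.
# The ./trainer package, job name, and model name below are hypothetical placeholders.
# ! gcloud ai-platform jobs submit training my_training_job \
#     --package-path ./trainer \
#     --module-name trainer.task \
#     --job-dir gs://$BUCKET_NAME/my_training_job \
#     --region $REGION \
#     --runtime-version 2.1 \
#     --python-version 3.7
# ! gcloud ai-platform models create my_model --regions $REGION
# ! gcloud ai-platform versions create v1 \
#     --model my_model \
#     --origin gs://$BUCKET_NAME/my_training_job/model \
#     --runtime-version 2.1 \
#     --framework tensorflow \
#     --python-version 3.7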
Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.
You may also change the REGION variable, which is used for operations throughout the rest of this notebook. Make sure to choose a region where Cloud AI Platform services are available. You may not use a Multi-Regional Storage bucket for training with AI Platform.
In [ ]:
BUCKET_NAME = "[your-bucket-name]" #@param {type:"string"}
REGION = "us-central1" #@param {type:"string"}
Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.
In [ ]:
! gsutil mb -l $REGION gs://$BUCKET_NAME
Finally, validate access to your Cloud Storage bucket by examining its contents:
In [ ]:
! gsutil ls -al gs://$BUCKET_NAME
{Put all your imports and installs up into a setup section.}
In [ ]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import sys, os
The tips below are specific to notebooks for TensorFlow/scikit-learn/PyTorch/XGBoost code.
Writing readable code here is critical, especially when working with notebooks: it helps other people read and understand your code. Following guidelines that others recognize makes it easier for them to read your code.
Use the highest-level API that gets the job done (unless the goal is to demonstrate the low-level API). For example, when using TensorFlow:
* Use tf.keras.Sequential > the Keras functional API > Keras model subclassing > ...
* Use model.fit > model.train_on_batch > a manual GradientTape loop.
* Use eager-style code.
* Use tensorflow_datasets and tf.data where possible (see the sketch after this list).
Use an imperative style: "Run a batch of images through the model."
Use sentence case in titles/headings.
Use short titles/headings: "Download the data", "Build the model", "Train the model".
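For example, the tf.data tip above might look like the following minimal sketch, which builds a small input pipeline from synthetic data (the shapes and sizes are purely illustrative):
In [ ]:
# Build a small input pipeline from synthetic data.
# In a real notebook, the data would come from tensorflow_datasets or your own files.
features = np.random.randn(64, 5).astype(np.float32)
labels = np.random.randint(0, 3, size=64)

dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=64)
    .batch(16)
)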
In [ ]:
# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu', input_shape=(None, 5)),
    tf.keras.layers.Dense(3)
])
In [ ]:
# Run the model on a single batch of data, and inspect the output.
result = model(tf.constant(np.random.randn(10, 5), dtype=tf.float32)).numpy()
print("min:", result.min())
print("max:", result.max())
print("mean:", result.mean())
print("shape:", result.shape)
In [ ]:
# Compile the model for training
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.categorical_crossentropy)
Keep examples quick. Use small datasets, or small slices of datasets. You don't need to train to convergence; train until it's obvious the model is making progress, as in the sketch below.
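For instance, a quick training run on a tiny synthetic dataset might look like this (the stand-in model, data shapes, and epoch count are illustrative only):
In [ ]:
# Tiny synthetic dataset: 64 examples, 5 features, 3 classes.
x_demo = np.random.randn(64, 5).astype(np.float32)
y_demo = np.random.randint(0, 3, size=64)

# A small stand-in model, trained just long enough to see the loss move.
demo_model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(3)
])
demo_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

demo_model.fit(x_demo, y_demo, epochs=2, batch_size=16)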
For a large example, don't try to fit all the code in the notebook. Add Python files to the tensorflow/examples repository, and in the notebook run: ! pip install git+https://github.com/tensorflow/examples
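For example, such an install cell might look like this:
In [ ]:
# Install supporting code from the tensorflow/examples repository rather than
# pasting it all into the notebook.
! pip install git+https://github.com/tensorflow/examples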
To clean up all GCP resources used in this project, you can delete the GCP project you used for the tutorial.
{Include commands to delete individual resources below}
In [ ]:
# Delete model version resource
! gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME
# Delete model resource
! gcloud ai-platform models delete $MODEL_NAME --quiet
# Delete Cloud Storage objects that were created
! gsutil -m rm -r $JOB_DIR
# If training job is still running, cancel it
! gcloud ai-platform jobs cancel $JOB_NAME --quiet