In [ ]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
{Include a paragraph or two explaining what this example demonstrates, who should be interested in it, and what you need to know before you get started.}
{Include a paragraph with Dataset information and where to obtain it}
In this notebook, you will learn how to {Complete the sentence explaining briefly what you will learn from the notebook, for example ML training, HP tuning, or serving}. The steps performed include:
* { add high level bullets for the steps of what you will perform in the notebook }
This tutorial uses billable components of Google Cloud Platform (GCP):
* Cloud AI Platform
* Cloud Storage
Learn about Cloud AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
In [ ]:
%pip install -U missing_or_updating_package --user
In [ ]:
# Automatically restart kernel after installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
The following steps are required, regardless of your notebook environment.
Select or create a GCP project. When you first create an account, you get a $300 free credit towards your compute/storage costs.
Google Cloud SDK is already installed in AI Platform Notebooks.
Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.
Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
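For example, a quick way to see this interpolation in action (MESSAGE is just an illustrative variable, not part of the setup):
In [ ]:
# Jupyter interpolates the Python variable MESSAGE into the shell command.
MESSAGE = "interpolated from Python"
! echo $MESSAGE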
In [ ]:
# Get your GCP project ID from gcloud
shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
PROJECT_ID = shell_output[0]
print("Project ID: ", PROJECT_ID)
Otherwise, set your project id here.
In [ ]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "[your-project-id]"  #@param {type:"string"}
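You can also make this the default project for gcloud itself, so commands that don't interpolate $PROJECT_ID still target the right project:
In [ ]:
# Set the default project for all gcloud commands in this notebook.
! gcloud config set project $PROJECT_ID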
In [ ]:
import sys, os

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your Google Cloud account. This provides access
# to your Cloud Storage bucket and lets you submit training jobs and prediction
# requests.
# If on AI Platform Notebooks, don't execute this code.
if not os.path.exists('/opt/deeplearning/metadata/env_version'):
    if 'google.colab' in sys.modules:
        from google.colab import auth as google_auth
        google_auth.authenticate_user()
    # If you are running this tutorial in a notebook locally, replace the string
    # below with the path to your service account key and run this cell to
    # authenticate your Google Cloud account.
    else:
        %env GOOGLE_APPLICATION_CREDENTIALS your_path_to_credentials.json
        # Log in to your account on Google Cloud.
        ! gcloud auth login
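Whichever environment you are in, it's worth confirming that the right account is active before moving on:
In [ ]:
# List the credentialed accounts gcloud knows about.
! gcloud auth list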
The following steps are required, regardless of your notebook environment.
When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. AI Platform runs the code from this package. In this tutorial, AI Platform also saves the trained model that results from your job in the same bucket. You can then create an AI Platform model version based on this output in order to serve online predictions.
Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.
You may also change the REGION variable, which is used for operations throughout the rest of this notebook. Make sure to choose a region where Cloud AI Platform services are available. You may not use a Multi-Regional Storage bucket for training with AI Platform.
In [ ]:
BUCKET_NAME = "[your-bucket-name]" #@param {type:"string"}
REGION = "us-central1" #@param {type:"string"}
Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.
In [ ]:
! gsutil mb -l $REGION gs://$BUCKET_NAME
Finally, validate access to your Cloud Storage bucket by examining its contents:
In [ ]:
! gsutil ls -al gs://$BUCKET_NAME
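As a sketch of the job-submission flow described above: the job name, output directory, and package layout (a trainer/ package with a trainer.task module) are hypothetical placeholders, while the gcloud flags are the standard ones for AI Platform training jobs:
In [ ]:
# Hypothetical job name and output directory for this sketch.
JOB_NAME = "example_training_job"
JOB_DIR = "gs://{}/job_output".format(BUCKET_NAME)

! gcloud ai-platform jobs submit training $JOB_NAME \
    --staging-bucket gs://$BUCKET_NAME \
    --job-dir $JOB_DIR \
    --package-path trainer/ \
    --module-name trainer.task \
    --region $REGION \
    --runtime-version 2.1 \
    --python-version 3.7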
{Put all your imports and installs up into a setup section.}
In [ ]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import os, sys
The tips below are specific to notebooks for TensorFlow/scikit-learn/PyTorch/XGBoost code.
Writing readable code is critical, especially when working with notebooks: it helps other people read and understand your code, and following guidelines that readers recognize makes your code easier to follow.
Use the highest-level API that gets the job done (unless the goal is to demonstrate the low-level API). For example, when using TensorFlow:
* Use tf.keras.Sequential > the Keras functional API > Keras model subclassing > ...
* Use model.fit > model.train_on_batch > manual tf.GradientTape loops.
* Use eager-style code.
* Use tensorflow_datasets and tf.data where possible.
Use an imperative style. "Run a batch of images through the model."
Use sentence case in titles/headings.
Use short titles/headings: "Download the data", "Build the model", "Train the model".
In [ ]:
# Build the model. The input shape matches the 5-feature batches used below.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu', input_shape=(5,)),
    tf.keras.layers.Dense(3)
])
In [ ]:
# Run the model on a single batch of data, and inspect the output.
result = model(tf.constant(np.random.randn(10, 5), dtype=tf.float32)).numpy()

print("min:", result.min())
print("max:", result.max())
print("mean:", result.mean())
print("shape:", result.shape)
In [ ]:
# Compile the model for training. The final Dense layer outputs logits,
# so configure the loss with from_logits=True.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True))
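Following the "use model.fit over lower-level loops" guidance above, here is a minimal training sketch on synthetic data; the dataset, shapes, and epoch count are illustrative assumptions, not part of the original example:
In [ ]:
# Synthetic dataset: 32 examples, 5 features, 3 one-hot classes (illustrative).
features = np.random.randn(32, 5).astype(np.float32)
labels = tf.one_hot(np.random.randint(0, 3, size=32), depth=3)
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(8)

# Two epochs is enough to show the loss moving; no need to train to convergence.
model.fit(dataset, epochs=2)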
Keep examples quick. Use small datasets, or small slices of datasets. You don't need to train to convergence; train just until it's obvious the model is making progress.
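For instance, tensorflow_datasets supports split slicing, so you can load just a sliver of a dataset; 'mnist' and the slice size below are only illustrative:
In [ ]:
import tensorflow_datasets as tfds

# Load only the first 1,000 training examples to keep the example fast.
small_train = tfds.load('mnist', split='train[:1000]', as_supervised=True)
small_train = small_train.batch(32)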
For a large example, don't try to fit all the code in the notebook. Add Python files to the tensorflow/examples repository, and in the notebook run:
! pip install git+https://github.com/tensorflow/examples
In [ ]:
# Clone the repo and switch into the example directory
! git clone https://github.com/GoogleCloudPlatform/professional-services.git
%cd professional-services/examples/cloudml-energy-price-forecasting/
# Rewrite the data path in trainer/task.py
! sed -i 's/energyforecast\/data/ai-platform-data\/energy_data/g' trainer/task.py
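To sanity-check that the sed substitution took effect, grep for the new path:
In [ ]:
# Confirm the data path was rewritten in trainer/task.py.
! grep -n 'ai-platform-data' trainer/task.py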
To clean up all GCP resources used in this project, you can delete the GCP project you used for the tutorial.
{Include commands to delete individual resources below}
In [ ]:
# These commands assume MODEL_NAME, MODEL_VERSION, JOB_NAME, and JOB_DIR were
# set when you created the corresponding resources earlier in the notebook.

# If the training job is still running, cancel it first
! gcloud ai-platform jobs cancel $JOB_NAME --quiet
# Delete model version resource
! gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME
# Delete model resource
! gcloud ai-platform models delete $MODEL_NAME --quiet
# Delete Cloud Storage objects that were created
! gsutil -m rm -r $JOB_DIR
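If you created the Cloud Storage bucket solely for this tutorial, you can remove it and everything in it as well; note that this deletes the data permanently:
In [ ]:
# Delete the bucket and all of its contents.
! gsutil -m rm -r gs://$BUCKET_NAME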