Some important links to keep open during the workshop – open these tabs now!:
TF documentation : Use the search box (top right) to get documentation on TensorFlow's rich API.
solutions/ : Every notebook in the exercises/ directory has a corresponding notebook in the solutions/ directory.
Colaboratory (Colab) is a Jupyter notebook environment which allows you to work with data and code in an interactive manner. You can decide where you want to run your code:
It supports Python 3 and comes with a set of pre-installed libraries such as TensorFlow and Matplotlib, and you can install more libraries on demand. The resulting notebooks are easy to share.
Caveats:
Getting started
connect
in the top right corner if you don't already see a green checkmark there.

Action | Colab Shortcut | Jupyter Shortcut |
---|---|---|
Executes current cell | <CTRL-ENTER> | <CTRL-ENTER> |
Executes current cell and moves to next cell | <SHIFT-ENTER> | <SHIFT-ENTER> |
Executes current selection | <CTRL-SHIFT-ENTER> | N/A |
Insert cell above | <CTRL-M> <A> | <A> |
Append cell below | <CTRL-M> <B> | <B> |
Shows searchable command palette | <CTRL-SHIFT-P> | <CTRL-SHIFT-P> |
Convert cell to code | <CTRL-M> <Y> | <Y> |
Convert cell to Markdown | <CTRL-M> <M> | <M> |
Autocomplete (on by default) | <CTRL+SPACE> | <TAB> |
Goes from edit to "command" mode | <ESC> | <ESC> |
Goes from "command" to edit mode | <ENTER> | <ENTER> |
Show keyboard shortcuts | <CTRL-M> <H> | <H> |
Note: On OS X you can use `
Give it a try!
In [ ]:
# YOUR ACTION REQUIRED:
# Execute this cell first using <CTRL-ENTER> and then using <SHIFT-ENTER>.
# Note the difference in which cell is selected after execution.
print('Hello world!')
You can also execute just a single statement in a cell.
In [ ]:
# YOUR ACTION REQUIRED:
# Execute only the first print statement by selecting the first line and pressing
# <CTRL-SHIFT-ENTER>.
print('Only print this line.')
print('Avoid printing this line.')
What to do if you get stuck
If you get stuck and the documentation doesn't help, consider asking for additional help.
In [ ]:
def xor_str(a, b):
    return ''.join([chr(ord(a[i % len(a)]) ^ ord(b[i % len(b)]))
                    for i in range(max(len(a), len(b)))])
# YOUR ACTION REQUIRED:
# Try to find the correct value for the variable below.
workshop_secret = '(replace me!)'
xor_str(workshop_secret,
'\x03\x00\x02\x10\x00\x1f\x03L\x1b\x18\x00\x06\x07\x06K2\x19)*S;\x17\x08\x1f\x00\x05F\x1e\x00\x14K\x115\x16\x07\x10\x1cR1\x03\x1d\x1cS\x1a\x00\x13J')
# Hint: You might want to check out the ../solutions directory
# (you should already have opened this directory in a browser tab :-)
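The cell above relies on XOR being its own inverse: applying the same key twice returns the original text. Here is a minimal sketch of that round trip. The key and message are made up for illustration and have nothing to do with the workshop secret, and the sketch assumes the key is not longer than the message.

```python
def xor_str(a, b):
    """XOR two strings character by character, cycling the shorter one."""
    return ''.join(chr(ord(a[i % len(a)]) ^ ord(b[i % len(b)]))
                   for i in range(max(len(a), len(b))))

# Hypothetical key and message, unrelated to the real workshop secret.
key = 'k3y'
message = 'Hello Colab!'

# Encoding and decoding are the same operation with the same key.
encoded = xor_str(key, message)
decoded = xor_str(key, encoded)

print(repr(encoded))
print(decoded)
```

If the key were longer than the message, the result would be padded to the key's length, so the round trip would no longer return exactly the original message.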
We'll be using TensorFlow 2.1.0 in this workshop. This will soon be the default, but for the time being we still need to activate it with the Colab-specific %tensorflow_version magic.
In [ ]:
# We must call this "magic" before importing TensorFlow. We will explain
# further down what "magics" (starting with %) are.
%tensorflow_version 2.x
In [ ]:
# Include basic dependencies and display the tensorflow version.
import tensorflow as tf
tf.__version__
In [ ]:
# Print the current working directory and list all files in it.
!pwd
!ls
In [ ]:
# Especially useful: Installs new packages.
!pip install qrcode
import qrcode
qrcode.make('Colab rocks!')
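Because packages installed with pip live only as long as the VM, it can be handy to check whether a module is importable before installing it again. A minimal sketch using only the standard library follows; `is_installed` is a hypothetical helper name, not part of Colab.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None

# 'json' ships with Python; 'qrcode' is only present after `pip install qrcode`.
for name in ('json', 'qrcode'):
    print(name, 'installed' if is_installed(name) else 'missing')
```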
Autocompletion and docstrings
Jupyter shows possible completions of partially typed commands.
Try it for yourself by displaying all available tf. methods that start with one.
In [ ]:
# YOUR ACTION REQUIRED:
# Set the cursor to after tf.one and press <CTRL-SPACE>.
# On Mac, only <OPTION-ESCAPE> might work.
tf.one
In addition, you can also display docstrings to see the function signature and possible parameters.
In [ ]:
# YOUR ACTION REQUIRED:
# Complete the command to `tf.maximum` and then add the opening bracket "(" to
# see the function documentation.
tf.maximu
Alternatively, you can inspect a function's details, including its docstring if available, by appending a "?".
In [ ]:
tf.maximum?
Note: This also works for any other type of object as can be seen below.
In [ ]:
test_dict = {'key0': 'Tensor', 'key1': 'Flow'}
test_dict?
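The "?" suffix is IPython syntax and is not available in a plain Python script. As a rough equivalent, the standard `inspect` module can show a signature and docstring. The sketch below uses a made-up function, `scaled_maximum`, purely for illustration.

```python
import inspect

def scaled_maximum(x, y, scale=1.0):
    """Return the larger of x and y, multiplied by scale."""
    return max(x, y) * scale

# Roughly what `scaled_maximum?` would show in a notebook.
signature = str(inspect.signature(scaled_maximum))
docstring = inspect.getdoc(scaled_maximum)

print('Signature:', signature)
print('Docstring:', docstring)
```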
As noted in the introduction above, Colab provides multiple runtimes with different hardware accelerators:
which can be selected by choosing "Runtime > Change runtime type" in the menu.
Please be aware that selecting a new runtime will assign a new virtual machine (VM). In general, assume that any changes you make to the VM environment, including data storage, are ephemeral. In particular, you may need to execute previous cells again, because a new runtime knows nothing about their results.
Let's take a closer look at one of these VMs.
Once we have been assigned a runtime we can inspect it further.
In [ ]:
# Display how long the system has been running.
# Note : this shows "0 users" because no user is logged in via SSH.
!uptime
As can be seen, the machine was allocated only very recently for our purposes.
VM specifications
In [ ]:
# Display available and used memory.
!free -h
print("-"*70)
# Display the CPU specification.
!lscpu
print("-"*70)
# Display the GPU specification (if available).
!(nvidia-smi | grep -q "has failed") && echo "No GPU found!" || nvidia-smi
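Some of the same information can also be gathered from Python itself, which is useful when the result should be stored in a variable rather than only printed. The following is a sketch using the standard library; `describe_machine` is a hypothetical helper, and it reports disk rather than memory usage because the standard library has no portable memory query.

```python
import os
import platform
import shutil

def describe_machine(path='.'):
    """Collect a few basic facts about the machine running this code."""
    total, used, free = shutil.disk_usage(path)
    return {
        'system': platform.system(),
        'machine': platform.machine(),
        'python_version': platform.python_version(),
        'cpu_count': os.cpu_count(),  # May be None if it cannot be determined.
        'disk_total_gb': round(total / 1e9, 1),
        'disk_free_gb': round(free / 1e9, 1),
    }

machine_info = describe_machine()
for key, value in machine_info.items():
    print(f'{key}: {value}')
```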
Matplotlib is one of the most famous Python plotting libraries and can be used to plot results within a cell's output (see Matplotlib Introduction).
Let's try to plot something with it.
In [ ]:
# Display the Matplotlib outputs within a cell's output.
%matplotlib inline
import numpy as np
from matplotlib import pyplot
# Create a randomized scatterplot using matplotlib.
x = np.random.rand(100).astype(np.float32)
noise = np.random.normal(scale=0.3, size=len(x))
y = np.sin(x * 7) + noise
pyplot.scatter(x, y)
Altair is a declarative visualization library for Python (see Altair: Declarative Visualization in Python).
Try to zoom in/out and to hover over individual data points in the resulting plot below.
In [ ]:
# Load an example dataset.
from vega_datasets import data
cars = data.cars()
# Plot the dataset, referencing dataframe column names.
import altair as alt
alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
    tooltip=['Name', 'Origin', 'Horsepower', 'Miles_per_Gallon']
).interactive()
The IPython and Colab environments support built-in commands called magics (see: IPython - Magics).
These commands complement plain Python and are handy, for example, for interacting directly with the VM or the notebook itself.
In [ ]:
%%sh
echo "This is a shell script!"
# List all running VM processes.
ps -ef
echo "Done"
In [ ]:
%%html
<!-- Embed custom HTML directly into a cell's output. -->
<marquee>HTML rocks</marquee>
You can also use line magics, which are prefixed with % and can be placed at the beginning of any line inside a cell.
Examples include:
For example, if you want to find out how long one specific line takes to execute, you can just prepend %time.
In [ ]:
n = 1000000
%time list1 = [i for i in range(n)]
print("")
%time list2 = [i for i in range(int(n/2))]
Note: Some line magics like %time can also be used for complete cells by writing %%time.
In [ ]:
%%time
n = 1000000
list1 = [i for i in range(n)]
list2 = [i for i in range(int(n/2))]
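%time and %%time are IPython features. In a plain Python script, `time.perf_counter` gives a comparable wall-clock measurement. The following sketch repeats the comparison above; `time_call` is a hypothetical helper name.

```python
import time

def time_call(fn):
    """Run fn() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed

n = 1000000
list1, seconds1 = time_call(lambda: [i for i in range(n)])
list2, seconds2 = time_call(lambda: [i for i in range(n // 2)])

print('full list: %.4f s' % seconds1)
print('half list: %.4f s' % seconds2)
```

A single run is noisy. For more reliable numbers, the standard `timeit` module repeats the measurement several times.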
There are multiple ways to provide data to a Colab's VM environment.
Note: This section only applies to Colab. Jupyter has a file explorer and other options for data handling.
The options include:
Uploading files from the local file system
If you need to manually upload files to the VM, you can use the files tab on the left. The files tab also allows you to browse the contents of the VM and when you double click on a file you'll see a small text editor on the right.
Connecting to Google Cloud Storage
Google Cloud Storage (GCS) is a cloud file storage service with a RESTful API.
We can use it to store our own data or to access existing data, which is addressed by an identifier of the following form:
gs://[BUCKET_NAME]/[OBJECT_NAME]
We'll use the data provided in gs://amld-datasets/zoo_img as can be seen below.
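To make the identifier format concrete, the sketch below splits a gs:// path into its bucket and object parts using the standard library. `split_gcs_path` is a hypothetical helper, not part of any Google library, and it assumes object names that contain no "?" or "#" characters.

```python
from urllib.parse import urlsplit

def split_gcs_path(gcs_path):
    """Split 'gs://bucket/object' into (bucket_name, object_name)."""
    parts = urlsplit(gcs_path)
    if parts.scheme != 'gs' or not parts.netloc:
        raise ValueError('Not a GCS path: %r' % gcs_path)
    # The bucket is the "host" part; the object name is the remaining path.
    return parts.netloc, parts.path.lstrip('/')

bucket, object_name = split_gcs_path('gs://amld-datasets/zoo_img')
print('bucket:', bucket)
print('object:', object_name)
```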
Before we can interact with the cloud environment, we need to grant permissions accordingly.
In [ ]:
from google.colab import auth
auth.authenticate_user()
List a subset of the contained files using the gsutil tool.
In [ ]:
!gsutil ls gs://amld-datasets/zoo_img | head
Conveniently, TensorFlow natively supports multiple file systems such as:
An example for the GCS filesystem can be seen below.
In [ ]:
# Note: This cell hangs if you forget to call auth.authenticate_user() above.
tf.io.gfile.glob('gs://amld-datasets/zoo_img/*')[:10]
Finally, we can take a look at the snippets support in Colab.
If you're using Jupyter, please see Jupyter contrib nbextensions - Snippets menu, as snippets are not natively supported there.
Snippets are a way to quickly "bookmark" pieces of code or text that you might want to insert into specific cells.
In [ ]:
# YOUR ACTION REQUIRED:
# Explore existing snippets by going to the `Code snippets` section.
# Click on the <> button on the left sidebar to open the snippets.
# Alternatively, you can press `<CTRL><ALT><P>` (or `<COMMAND><OPTION><P>` for
# OS X).
We have created some default snippets for this workshop in:
In order to use these snippets, you can:
As soon as you update the settings, the snippets will then become available in every Colab. Search for "amld" to quickly find them.
Alternatively, you can also add snippets via the API (but this needs to be done for every Colab/kernel):
In [ ]:
from google.colab import snippets
# snippets.register('https://colab.research.google.com/drive/1OFSjEmqC-UC66xs-LR7-xmgkvxYTrAcN')
Pro tip : Maybe this is a good moment to create your own snippets and register them in settings. You can then start collecting often-used code and have it ready when you need it... In this Colab you'll need to have text cells with titles (like ### snippet name) preceding the code cells.
In [ ]:
from IPython.core.magic import register_line_cell_magic

@register_line_cell_magic
def mymagic(line_content, cell_content=None):
    print('line_content="%s" cell_content="%s"' % (line_content, cell_content))
In [ ]:
%mymagic Howdy Alice!
In [ ]:
%%mymagic simple question
Howdy Alice!
how are you?
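To see what the two arguments contain, the function can also be called as plain Python. This is only an approximation of what IPython does: a line magic receives the rest of the line, while a cell magic additionally receives the remaining cell lines as one string (exact trailing whitespace may differ). The sketch returns the text instead of printing it and omits the IPython decorator so that it runs anywhere.

```python
def mymagic(line_content, cell_content=None):
    """Same formatting as the magic above, without the IPython decorator."""
    return 'line_content="%s" cell_content="%s"' % (line_content, cell_content)

# Approximately what `%mymagic Howdy Alice!` passes in.
as_line_magic = mymagic('Howdy Alice!')

# Approximately what the `%%mymagic simple question` cell passes in.
as_cell_magic = mymagic('simple question', 'Howdy Alice!\nhow are you?')

print(as_line_magic)
print(as_cell_magic)
```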
In [ ]:
#@title Execute me
# Hidden cell content.
print("Double click the cell to see its content.")
In [ ]:
# Form example mostly taken from "Adding form fields" Snippet.
#@title Example form
#@markdown Specify some test data and execute this cell.
string_type = 'test_string' #@param {type: "string"}
slider_value = 145 #@param {type: "slider", min: 100, max: 200}
number = 1339 #@param {type: "number"}
date = '2019-01-26' #@param {type: "date"}
pick_me = "a" #@param ['a', 'b', 'c']
#@markdown ---
print("Submitted data:")
print(string_type, slider_value, number, date, pick_me)
An example of an IPython tool that you can utilize is the interactive debugger provided inside an IPython environment like Colab.
For instance, by using %pdb on, you can automatically trigger the debugger on exceptions to further analyze the state.
Some useful debugger commands are:
Command | Description |
---|---|
h(elp) | Display available commands |
p(rint) x | Show content of object x |
w(here) | Show current instruction pointer position |
q(uit) | Leave the debugger |
In [ ]:
# YOUR ACTION REQUIRED:
# Execute this cell, print the variable contents of a, b and exit the debugger.
%pdb on
a = 67069 / 47 - 0x5a
b = a - 0x539
#c = a / b # Will throw an exception.
We won't dive further into debugging, but it's useful to know that this option exists.
Please see Python Docs - pdb The Python Debugger for more information.
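When an interactive debugger is not an option, for example in a script that runs unattended, similar information can be recovered from the exception itself. The sketch below walks the traceback to the innermost frame and reads its local variables. `divide` is a made-up function for illustration.

```python
def divide(numerator, denominator):
    return numerator / denominator

try:
    divide(10, 0)
except ZeroDivisionError as exc:
    # Walk to the innermost frame, i.e. where the exception was raised.
    tb = exc.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next
    failing_function = tb.tb_frame.f_code.co_name
    failing_locals = dict(tb.tb_frame.f_locals)

print('Failed in:', failing_function)
print('Local variables:', failing_locals)
```

This is what `p x` gives you interactively: the values of the variables at the point of failure.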
While notebook environments like Colab/Jupyter provide many benefits, they also come with some caveats that you should be aware of. One example is that it is easy to execute cells in the wrong order, leading to unexpected behavior.
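The execution-order problem can be simulated in plain Python. In this sketch, three hypothetical "cells" share one namespace, as notebook cells do, and the final state depends on the order in which they run.

```python
# Three hypothetical notebook cells, stored as source strings.
cells = {
    'cell_1': 'x = 1',
    'cell_2': 'y = x * 10',
    'cell_3': 'x = 5',
}

def run_cells(order):
    """Execute the cells in the given order in one shared namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace['x'], namespace['y']

top_to_bottom = run_cells(['cell_1', 'cell_2', 'cell_3'])
out_of_order = run_cells(['cell_1', 'cell_3', 'cell_2'])

print('top to bottom (x, y):', top_to_bottom)
print('out of order  (x, y):', out_of_order)
```

Both runs end with the same x but a different y, even though the notebook's visible code is identical. Choosing "Restart and run all" before sharing a notebook is a common way to catch this.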
If you're interested in more examples feel free to take a look at:
YouTube - I don't like notebooks by Joel Grus (duration ~56 minutes)