Licensed under the Apache License, Version 2.0 (the "License");


In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License"); { display-mode: "form" }
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Introduction to TensorFlow Part 1 - Basics


In [0]:
#@title Upgrade to TensorFlow 2.1+
!pip install --upgrade tensorflow

In [0]:
#@title Install Libraries for this colab
!pip install scipy
!pip install matplotlib
import seaborn as sns
import tensorflow as tf
# Load utility to save the graph in a tensorboard-friendly way
from tensorflow.python.summary.writer.writer import FileWriter
from PIL import Image
%load_ext tensorboard

What this notebook covers

The aim of this mini course is to get you up and running with TensorFlow. It's not just a walkthrough. To get the most out of this, you should follow along by running the examples below. We cover just the basics of TensorFlow.

  • Underlying concepts of Tensors, shapes, Operations, graphs etc.
  • The functionality available in the core TensorFlow API.
  • We are not really concerned with Machine Learning here. While TensorFlow is widely associated with ML, it can be (and was originally designed to be) used to accelerate most traditional numerical calculations.

Tips And References

  • The TensorFlow API reference is indispensable while you learn the ropes: Tensorflow Python API
  • You will need to refer to it throughout this course.
  • This notebook uses Google Colab, which allows you to run code interactively. See the Colab documentation for more information.
  • In Colab you can use the autocomplete feature to see the available functions and the usage.
    • To see all the names available under some module (e.g. tf.constant under tf), type the name of the module followed by a period and hit Ctrl+Space.
    • To see the usage instructions for a function, type the function name followed by an open bracket and hit Tab.

In [0]:
#@title
%%html
<div style="text-align:center; font-size: 100px; opacity: 0.9; color:orange">Let's Begin</div>

What is TensorFlow

TensorFlow is a system designed for efficient numerical computations.

  • The core of TensorFlow is written in C++.
  • The client API is most commonly used from Python, though other languages are also supported.
  • A TensorFlow program describes the computation as a sequence of operations.
  • The operations (also called Ops) act on zero or more inputs. The inputs are called Tensors.
  • The outputs of ops are also Tensors.
  • You can think of tensors as the data and the ops as the functions that act on that data.
  • Output of one op can be sent as input to another op.
  • A TensorFlow program is a directed graph whose edges are tensors and the nodes are Ops.
  • The tensors 'flow' along the edges, hence, TensorFlow.

Let us consider an example.


In [0]:
# Define a function that returns sin(a + b). We add names for each of the
# tensors and operations.
def compute_sin(a, b):
  """Computes sine of a sum of two numbers."""
  # a, b, c and d below are tensors.
  a = tf.convert_to_tensor(a, name = "a")
  b = tf.convert_to_tensor(b, name = "b")

  # This creates an output tensor of the 'Op' tf.add.
  # The tensor's name is 'addition', which is what you'll see in the graph
  # below. And we store the tensor in a python variable called 'c'
  c = tf.add(a, b, name="addition")

  # The Op 'tf.sin' consumes the input tensor 'c' and returns another tensor 'd'
  d = tf.sin(c, name="sin")
  return d

# Returns sin(1.0 + 2.0)
compute_sin(1.0, 2.0)

In [0]:
# Computation graph visualization
g = tf.Graph()  # This is the graph to be saved
with g.as_default():
  with tf.name_scope('main_graph'):
    c = compute_sin(1.0, 2.0)  # Add calculations to the graph 
  writer = FileWriter("logs", g)
  writer.close()

# Invoke tensorboard to visualize the graph
%tensorboard --logdir=logs

Execution Modes

TensorFlow has two modes of execution: deferred (graph) and eager (the default since v2).

Deferred Execution

To use deferred execution, eager execution first has to be disabled:

tf.compat.v1.disable_eager_execution()

With deferred mode, you build up a graph of operations, so this tensorflow code

a = tf.constant(1.0)
b = tf.constant(2.0)
c = tf.add(a, b)

is roughly equivalent to this python code

a = 1
b = 2
def c():
  return a + b

In the python code, c does not hold the value 3: it holds a function. In order to get the value 3, you need to execute the function:

c() # returns the value 3

In the same way, with TensorFlow in deferred (graph) mode, c doesn't hold the value 3; it is a node in an execution graph that can be evaluated inside a Session to obtain the value 3, as shown below.


In [0]:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
a = tf.constant([1.0], name = "a")
b = tf.constant([2.0], name = "b")
c = tf.add(a, b, name="addition")

print(c)
with tf.compat.v1.Session() as sess:
  result = sess.run(c)
print(result)

While the graph is being built, TensorFlow does little more than check that the operations have been passed the right numbers of arguments and that they're of the right type and shape. Once the graph has been built, it can be

  • serialized, stored, repeatedly executed
  • analyzed and optimised: removing duplicate operations, hoisting constants up levels etc.
  • split across multiple CPUs, GPUs and TPUs
  • etc.

This allows for a lot of flexibility and performance for long calculations. However, when experimenting interactively, this mode gets in the way of the REPL behaviour that python developers are used to. Thus, in TensorFlow v2, eager execution became the default.

To recover the performance benefits of graph-mode optimization while writing eager-style code, one can wrap calculations in tf.function, which executes them as a single graph.

tf.compat.v1.Session and tf.function are discussed in more detail later on.
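As a minimal preview (a sketch only, assuming eager execution is still enabled, which is the TF 2 default), wrapping a computation in tf.function traces it into a graph object that can be inspected, optimized and serialized; the function f below is purely illustrative.

import tensorflow as tf

@tf.function
def f(x):
  # An arbitrary computation, used only to illustrate tracing.
  return tf.sin(x) + 1.0

# Tracing f for a given input signature yields a concrete function backed by
# a tf.Graph, which can be serialized via as_graph_def().
concrete = f.get_concrete_function(tf.TensorSpec(shape=[None], dtype=tf.float32))
print("Number of ops in the traced graph:", len(concrete.graph.as_graph_def().node))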

Eager Execution

Eager execution works in a more intuitive fashion. Operations are executed immediately, rather than being stored as deferred instructions. With eager mode enabled, the following

import tensorflow as tf

a = tf.constant([1.0], name = "a")
b = tf.constant([2.0], name = "b")
c = tf.add(a, b, name="addition")

print (c)

would produce the output

tf.Tensor([3.], shape=(1,), dtype=float32)

Eager execution is enabled by default. Note that the choice of execution mode is an irrevocable, global one. It must be made before any TensorFlow method is called: once your process is using TensorFlow in one mode, it cannot switch to the other without restarting the kernel.

Tensors and Shapes

  • A tensor is just an $n$-dimensional generalization of a matrix.
  • The rank of a tensor is the number of dimensions it has (alternatively, the number of indices you have to specify to get at an element).
  • A vector is a rank $\color{blue} 1$ tensor.
  • A matrix is a rank $\color{blue} 2$ tensor.
  • A tensor is characterized by its shape and the data type of its elements.
  • The shape is a specification of the number of dimensions and the length of the tensor in each of those dimensions.
  • Shape is described by an integer vector giving the lengths in each dimension.
  • For example, $\left[\begin{array}{cc} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{array}\right]$ is a tensor of shape [3, 2].
  • On the other hand, $\left[\begin{array}{cc} [1] & [0] \\ [0] & [1] \\ [1] & [1] \end{array}\right]$ is a tensor of shape [3, 2, 1].
  • The shape is read starting from the "outside" and moving in until you reach
    an elementary object (e.g. number or string).
  • Note that Tensors are not just arbitrary arrays. For example, $[1, [2]]$ is not a Tensor and has no unambiguous shape.
  • TensorFlow shapes are almost the same as numpy shapes.

In [0]:
#@title Fun with shapes

import tensorflow as tf
import numpy as np

# This is equivalent to a 0-rank tensor (i.e. a scalar).
x = np.array(2.0)
t = tf.constant(x)
print ("Shape of %s = %s\n" % (x, t.shape))

# A rank 1 tensor. Shape = [5]
x = np.array([1, 2, 3, 4, 5])
t = tf.constant(x)
print ("Shape of %s: %s\n" % (x, t.shape))

# A rank 2 tensor. Shape = [5, 1]
x = np.array([[1], [2], [3], [4], [5]])
t = tf.constant(x)
print ("Shape of %s: %s\n" % (x, t.shape))

# A rank 2 tensor. Shape = [1, 5]
x = np.array([[1, 2, 3, 4, 5]])
t = tf.constant(x)
print ("Shape of %s: %s\n" % (x, t.shape))

# A rank 3 tensor. Shape = [2, 1, 2]
x = np.array(
    [
        [ [0, 0] ],
        [ [0, 0] ]
    ])

t = tf.constant(x)
print ("Shape of %s: %s\n" % (x, t.shape))

In [0]:
#@title Shape Quiz

# to-do: Fill in an array of shape [1, 2, 1, 2] in the variable x. 
# The values you choose don't matter but the shape does.

x = np.array([])
t = tf.constant(x, name = "t")


if t.shape == [1, 2, 1, 2]:
  print ("Success!")
else:
  print ("Shape was %s. Try again"%t.shape)

In [0]:
#@title Solution: Shape Quiz. Double click to reveal

import numpy as np
import tensorflow as tf

# The values you choose don't matter but the shape does.

x = np.array([ [[[0, 0]], [[0, 0]]] ] )
t = tf.constant(x)

if t.shape == [1, 2, 1, 2]:
  print ("Success!")
else:
  print ("Shape was %s. Try again"%t.shape)

Shape And Reshape

Most TensorFlow operations preserve tensor shapes or modify them in obvious ways.
However, you often need to rearrange the shape to fit the problem at hand.

There are a number of shape-related ops in TensorFlow that you can make use of.
First, we have these ops that give us information about the tensor shape:

Name Description
tf.shape Returns the shape of the tensor
tf.size Returns the total number of elements in the tensor
tf.rank Returns the tensor rank

In [0]:
#@title Shape Information Ops

import numpy as np

# These examples are a little silly because we already know
# the shapes. 
x = tf.constant(np.zeros([2, 2, 3, 12]))

shape_x = tf.shape(x, name="my_shape")
print("Shape of x: %s" % shape_x)

rank_x = tf.rank(x)
print("Rank of x: %s" % rank_x)

size_x = tf.size(x)
print("Size of x: %s" % size_x)

NB: The hawkeyed amongst us will have noticed that there seem to be two different
shape methods. In the earlier examples we saw the tensor.shape property, and above
we saw tf.shape(tensor). There are subtle differences between the two, which we
will discuss more when we talk about placeholders.
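As a small illustrative sketch of the difference (values chosen arbitrarily): tensor.shape is a static TensorShape attached to the tensor when it is defined, while tf.shape(tensor) is itself an op whose output is computed when the computation runs, which matters when some dimensions are not known in advance.

import tensorflow as tf

x = tf.zeros([2, 3])
print(x.shape)      # Static shape, known immediately: (2, 3)
print(tf.shape(x))  # A tensor holding [2 3], produced by an op

# Inside a tf.function traced with an unknown first dimension, the static
# shape is partially unknown, but tf.shape still returns the runtime value.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
def runtime_shape(t):
  print("Static shape seen while tracing:", t.shape)  # (None, 3)
  return tf.shape(t)

print(runtime_shape(tf.ones([5, 3])))  # [5 3]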

Reshaping Continued

Coming back to the ops that you can use to modify the shape, the following table lists some of them.

Name Description
tf.reshape Reshapes a tensor while preserving number of elements
tf.squeeze "Squeezes out" dimensions of length 1
tf.expand_dims Inverse of squeeze. Expands the dimension by 1
tf.transpose Permutes the dimensions. For matrices, performs usual matrix transpose.
tf.meshgrid Effectively creates an N dimensional grid from N one dimensional arrays.

The following example demonstrates the use of the reshape op.


In [0]:
#@title Reshaping Tensors

import numpy as np

# Create a constant tensor of shape [12]
x = tf.constant(np.arange(1, 13))
print("x = %s\n" % x)

# Reshape this to [2, 6]. Note how the elements get laid out.
x_2_6 = tf.reshape(x, [2, -1])
print("x_2_6 = %s\n" % x_2_6)

# Further rearrange x_2_6 to [3, 4]
x_3_4 = tf.reshape(x_2_6, [3, 4])
print("x_3_4 = %s\n" % x_3_4)

# In fact you don't have to specify the full shape. You can leave
# one component of the shape unspecified by setting it to -1.
# This component will then be computed automatically by preserving the
# total size.
x_12_1 = tf.reshape(x_3_4, [-1, 1])
print("x_12_1 = %s\n" % x_12_1)

# What happens when there are too many or too few elements?
# You get an error!
#x_wrong = tf.reshape(x_3_4, [4, 5])
#print("x_wrong = %s" % x_wrong)

The next set of examples show how to use the squeeze and expand_dims ops.


In [0]:
#@title Squeezing and Expanding Tensors

import numpy as np

# Create a tensor where the second and fourth dimension is of length 1.
x = tf.constant(np.reshape(np.arange(1, 5), [2, 1, 2, 1]))
print("Shape of x = %s" % x.shape)

# Now squeeze out all the dimensions of length 1
x_squeezed = tf.squeeze(x)
print("\nShape of x_squeezed = %s" % x_squeezed.shape)

# You can control which dimension you squeeze
x_squeeze_partial = tf.squeeze(x,3)
print("\nShape of x_squeeze_partial = %s" % x_squeeze_partial.shape)

# Expand_dims works in reverse to add dimensions of length one.
# Think of this as just adding brackets [] somewhere in the tensor.

y = tf.constant([[1, 2],[3, 4]])
y_2 = tf.expand_dims(y, 2)
y_3 = tf.expand_dims(y_2, 2)
print("\nShape of y = %s" % y.shape)
print("\nShape of y_2 = %s" % y_2.shape)
print("\nShape of y_3 = %s" % y_3.shape)
  • The transpose op deserves a bit of explanation.
  • For matrices, it does the usual transpose operation.
  • For higher rank tensors, it allows you to permute the dimensions by specifying the permutation you want.

Examples will (hopefully) make this clearer.


In [0]:
#@title Transposing tensors

# Create a matrix
x = tf.constant([[1, 2], [3, 4]])

x_t = tf.transpose(x)

print("X:\n%s\n" % x)

print("transpose(X):\n%s\n" % x_t)

# Now try this for a higher rank tensor. 

# Create a tensor of shape [3, 2, 1]
y = tf.constant([[[1],[2]], [[3],[4]], [[5],[6]]])

print("Shape of Y: %s\n" % y.shape)
print("Y:\n%s\n" % y)

# Flip the first two dimensions
y_t12 = tf.transpose(y, [1, 0, 2])

print("Shape of Y with the first two dims flipped: %s\n" % y_t12.shape)
print("transpose(Y, 1 <-> 2):\n%s\n" % y_t12)

Quiz: Create a Grid

We mentioned the tf.meshgrid op above but didn't use it. In this quiz you will
use it to do something we will find useful later on.

Suppose we are given a set of x coordinates, say, [1, 2, 3] and another set of
y coordinates e.g. [1, 2, 3]. We want to create the "grid" formed from these coordinates as shown in the following diagram.

tf.meshgrid allows you to do this but it will produce the X and Y coordinates of
the grid separately. Your task below is to create a tensor of complex numbers such that Z = X + j Y represents points on the grid (e.g. the lower-left-most point will have Z = 1 + j while the top-right one has Z = 3 + 3j).

You should put your code in the function create_grid and run the cell when you are done. If it works, you will see a plot of the grid that you produced.

Hints:

  • Experiment with tf.meshgrid to get X and Y of the right shape needed for the grid.
  • Join the separate X and Y using tf.complex(x, y)

In [0]:
#@title 

%%html
<div style="text-align:center; font-size: 40px; opacity: 0.9; color:blue"><p>Example Grid</p>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="400px" height="300px" viewBox="0 0 400 300" preserveAspectRatio="xMidYMid meet" ><rect id="svgEditorBackground" x="0" y="0" width="300" height="300" style="fill: none; stroke: none;"/>
  <line x1="106.26899719238736" y1="50.779850959776496" x2="107.26899719238736" y2="236.77943420410023" stroke="black" style="stroke-width: 1px; fill: none;"/>
  <line x1="174.26887512207486" y1="50.779850959776496" x2="174.26887512207486" y2="236.77943420410023" stroke="black" style="stroke-width: 1px; fill: none;"/>
  <line x1="240.26852416992642" y1="50.779850959776496" x2="240.26852416992642" y2="236.77943420410023" stroke="black" style="stroke-width: 1px; fill: none;"/>
  <text fill="black" x="101.269" y="271.779" style="font-family: Arial; font-size: 20px;" >
    <tspan x="100.271" y="256.785" dy="" dx="">1</tspan>
    <tspan x="169.271" dy="0" y="" dx="">2</tspan>
    <tspan x="234.271" dy="0" y="" dx="">3</tspan>
  </text>
  <line x1="62.26904296875455" y1="209.77961730956898" x2="320.26831054687955" y2="209.77961730956898" stroke="black" style="stroke-width: 1px; fill: none;"/>
  <line x1="62.26904296875455" y1="153.77967834472523" x2="320.26831054687955" y2="153.77967834472523" stroke="black" style="stroke-width: 1px; fill: none;"/>
  <line x1="62.26904296875455" y1="99.77981567382679" x2="320.26831054687955" y2="99.77981567382679" stroke="black" style="stroke-width: 1px; fill: none;"/>
  <text fill="black" x="42.269" y="215.78" id="e523_texte" style="font-family: Arial; font-size: 20px;" >1</text>
  <text fill="black" x="42.269" y="156.78" id="e552_texte" style="font-family: Arial; font-size: 20px;" >2</text>
  <text fill="black" x="41.269" y="105.78" id="e564_texte" style="font-family: Arial; font-size: 20px;" >3</text>
  <circle id="e616_circle" cx="105.26899719238736" cy="99.77981567382679" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e628_circle" cx="173.26887512207486" cy="99.77981567382679" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e640_circle" cx="240.26852416992642" cy="99.77981567382679" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e652_circle" cx="240.26852416992642" cy="153.77967834472523" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e664_circle" cx="241.26850891113736" cy="208.77961730956898" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e676_circle" cx="174.26887512207486" cy="153.77967834472523" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e688_circle" cx="106.26899719238736" cy="153.77967834472523" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e700_circle" cx="107.26899719238736" cy="208.77961730956898" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <circle id="e712_circle" cx="174.26887512207486" cy="209.77961730956898" stroke="black" style="stroke-width: 1px;" r="3.25248" fill="khaki"/>
  <text fill="black" x="111.269" y="199.78" id="e749_texte" style="font-family: Arial; font-size: 16px;" dy="" dx="" >(1,1)</text>
  <text fill="black" x="174.269" y="201.78" id="e835_texte" style="font-family: Arial; font-size: 16px;" >(2,1)</text>
  <text fill="black" x="107.269" y="90.7798" id="e847_texte" style="font-family: Arial; font-size: 16px;" >(1,3)</text>
  <text fill="black" x="108.269" y="145.78" id="e859_texte" style="font-family: Arial; font-size: 16px;" dy="" dx="" >(1,2)</text>
  <text fill="black" x="174.269" y="145.78" id="e967_texte" style="font-family: Arial; font-size: 16px;" >(2,2)</text>
  <text fill="black" x="175.269" y="92.7798" id="e994_texte" style="font-family: Arial; font-size: 16px;" >(2,3)</text>
  <text fill="black" x="240.269" y="200.78" id="e1021_texte" style="font-family: Arial; font-size: 16px;" >(3,1)</text>
  <text fill="black" x="241.269" y="145.78" id="e1048_texte" style="font-family: Arial; font-size: 16px;" >(3,2)</text>
  <text fill="black" x="241.269" y="92.7798" id="e1075_texte" style="font-family: Arial; font-size: 16px;" >(3,3)</text>
  <text fill="black" x="176.269" y="284.779" id="e1257_texte" style="font-family: Arial; font-size: 20px;" >x</text>
  <text fill="black" x="11.269" y="157.78" id="e1272_texte" style="font-family: Arial; font-size: 20px;" >y</text>
</svg>
</div>

In [0]:
#@title Reshaping Quiz

import numpy as np
import matplotlib.pyplot as plt

def create_grid(x, y):
  """Creates a grid on the complex plane from x and y.

  Given a set of x and y coordinates as rank 1 tensors 
  of sizes n and m respectively, returns a complex tensor 
  of shape [n, m] containing points on the grid formed by
  intersection of horizontal and vertical lines rooted at
  those x and y values.
  
  Args:
    x: A float32 or float64 tensor of shape [n]
    y: A tensor of the same data type as x and shape [m].
    
  Returns:
    A complex tensor with shape [n, m].
  """
  raise NotImplementedError()

coords = tf.constant([1.0, 2.0, 3.0])
square_grid = create_grid(coords, coords)

def test():
  x_p = np.array([1.0, 2.0, 3.0])
  y_p = np.array([5.0, 6.0, 7.0, 8.0])
  grid = create_grid(tf.constant(x_p),  tf.constant(y_p))
  n_p = x_p.size * y_p.size
  x = np.reshape(np.real(grid), [n_p])
  y = np.reshape(np.imag(grid), [n_p])
  plt.plot(x, y, 'ro')
  plt.xlim((x_p.min() - 1.0, x_p.max() + 1.0))
  plt.ylim((y_p.min() - 1.0, y_p.max() + 1.0))
  plt.ylabel('Imaginary')
  plt.xlabel('Real')
  plt.show()
  
test()

In [0]:
#@title Reshaping Quiz - Solution. Double click to reveal

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

def create_grid(x, y):
  """Creates a grid on the complex plane from x and y.

  Given a set of x and y coordinates as rank 1 tensors 
  of sizes n and m respectively, returns a complex tensor 
  of shape [n, m] containing points on the grid formed by
  intersection of horizontal and vertical lines rooted at
  those x and y values.
  
  Args:
    x: A float32 or float64 tensor of shape [n]
    y: A tensor of the same data type as x and shape [m].
    
  Returns:
    A complex tensor with shape [n, m].
  """
  
  X, Y = tf.meshgrid(x, y)
  return tf.complex(X, Y)

coords = tf.constant([1.0, 2.0, 3.0])
square_grid = create_grid(coords, coords)

def test():
  x_p = np.array([1.0, 2.0, 3.0])
  y_p = np.array([5.0, 6.0, 7.0, 8.0])
  grid = create_grid(tf.constant(x_p),  tf.constant(y_p))
  n_p = x_p.size * y_p.size
  x = np.reshape(np.real(grid), [n_p])
  y = np.reshape(np.imag(grid), [n_p])
  plt.plot(x, y, 'ro')
  plt.xlim((x_p.min() - 1.0, x_p.max() + 1.0))
  plt.ylim((y_p.min() - 1.0, y_p.max() + 1.0))
  plt.ylabel('Imaginary')
  plt.xlabel('Real')
  plt.show()
  
test()

Tensors vs. Numpy Arrays

  • Tensors can be created by wrapping numpy arrays (as above) or even python lists.
  • You can use all the Numpy methods to create arrays that can be wrapped as a tensor.
  • Most TensorFlow ops will accept a numpy array directly but they will convert it to a tensor implicitly.
  • For many Numpy methods, there are analogous TensorFlow methods which we will see later.
    • Example: np.zeros $\leftrightarrow$ tf.zeros
  • However, a tensor is not the same as a numpy array.
    • Tensors are more like "pointers" to the data.
    • In deferred (graph) mode, tensors don't hold their values until they are evaluated. Numpy arrays are always eagerly evaluated.
    • In graph mode, you can't convert a Tensor back to a numpy array without evaluating the graph. In eager mode you can do so by invoking the numpy() method on the tensor.

The following examples clarify this.


In [0]:
# Import TensorFlow and numpy.
import numpy as np

# You can make a tensor out of a python list.
tensor_of_ones = tf.constant([1, 1, 1], dtype=tf.int64)

# You can also use numpy methods to generate the arrays.
tensor_of_twos = tf.constant(np.repeat(2, [3]))

# TensorFlow Ops (tf.add) accept tensors ...
tensor_of_threes = tf.add(tensor_of_ones, tensor_of_twos)

# ... and (sometimes) also the numpy array directly.
tensor_of_threes_1 = tf.add(np.array([1, 1, 1]), 
                            tensor_of_twos)

# You can check that the result is a tensor, not a numpy array.
print("Type: %s" % type(tensor_of_threes)) # This is not an array!
                                           # It is a tensor.
# In eager mode you can convert the tensor back to a numpy array.
print("Numpy conversion: %s" % tensor_of_threes.numpy())

How does it work?

  • tf.constant creates a tensor from some constant data.
  • In addition to supplying python lists, you may also supply numpy arrays.
  • Much easier to create higher dimensional data with numpy arrays.
  • tf.add is an example of an Op. It takes two tensor arguments and returns a tensor.
  • Tensors have a definite type, e.g. int32, float64, bool etc.
    • By default, TF will use the type of the supplied numpy array.
    • For python lists, TF will try to infer the type but you are better off supplying the type explicitly using "dtype=" arg.
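A quick sketch of the last two points (the values are arbitrary):

import numpy as np
import tensorflow as tf

# The dtype of a supplied numpy array is preserved (typically int64/float64).
print(tf.constant(np.array([1, 2, 3])).dtype)        # tf.int64
print(tf.constant(np.array([1.0, 2.0, 3.0])).dtype)  # tf.float64

# For python lists, TF infers a default type (int32/float32) ...
print(tf.constant([1, 2, 3]).dtype)   # tf.int32
print(tf.constant([1.0, 2.0]).dtype)  # tf.float32

# ... so it is safer to be explicit with the dtype argument.
print(tf.constant([1, 2, 3], dtype=tf.float64).dtype)  # tf.float64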

Tensor data types

  • Other than shape, a tensor is characterized by the type of the data elements it contains.
  • It is useful to keep in mind the following commonly used types:
    • Integer types: tf.int32, tf.int64
    • Float types: tf.float32, tf.float64
    • Boolean type: tf.bool
    • Complex type: tf.complex64, tf.complex128
  • Many tensor creation ops (e.g. tf.constant, tf.convert_to_tensor etc.) accept an optional dtype argument.
  • TensorFlow does not do automatic type conversions. For example, the following code causes an error.

In [0]:
#@title Strict Types in TensorFlow

int32_tensor = tf.constant([1, 1], dtype=tf.int32)
int64_tensor = tf.constant([2, 2], dtype=tf.int64)
try_mix_types = tf.add(int32_tensor, int64_tensor)  # Causes a TypeError
  • Occasionally, we need to convert one type to another (e.g. int32 -> float32).
  • There are a few explicit conversion ops for specific target types.
  • If you need a conversion that isn't covered by one of those, you can use the more general tf.cast.

In [0]:
#@title Casting from one type to another

# Make sure this is an int32 tensor by explicitly specifying type.
# NB: In this particular case, even if you left out the type, TF
# will infer it as an int32.
int32_tensor = tf.constant([1, 1], dtype=tf.int32)

int64_tensor = tf.constant([2, 2], dtype=tf.int64)

casted_to_64 = tf.cast(int32_tensor, dtype=tf.int64)

# This is OK.
added = tf.add(casted_to_64, int64_tensor)


# As an example of tf.cast, consider casting to boolean
zero_one = tf.constant([1.0, 0.0, 1.0])  # Inferred as tf.float32
print("Type of zero_ones = %s" % repr(zero_one.dtype))

zero_one_bool = tf.cast(zero_one, tf.bool)
print("Type of zero_ones_bool = %s" % repr(zero_one_bool.dtype))

# Another example of cast: Convert real numbers to Complex
real_tensor = tf.constant([1.0, 1.0])
cplx_tensor = tf.cast(real_tensor, tf.complex64)

Creating Tensors

  • We have already seen that tf.constant creates a tensor from supplied data.
  • Some other useful functions are in the table below. Use the Colab auto complete feature to see their usage instructions.

Constant Tensors

Name Description
tf.zeros Creates a constant tensor of zeros of a given shape and type.
tf.zeros_like Creates a constant tensor of zeros of the same shape as the input tensor.
tf.ones Creates a constant tensor of ones of a given shape and type.
tf.ones_like Creates a constant tensor of ones of the same shape as the input tensor.
tf.linspace Creates an evenly spaced tensor of values between supplied end points.

The following example demonstrates some of these ops.


In [0]:
#@title Creating Constant Tensors without numpy

# Create a bunch of zeros of a specific shape and type.
x = tf.zeros([2, 2], dtype=tf.float64)
print("tf.zeros example: %s" % x)

# tf.zeros_like is pretty useful. It creates a zero tensor which is
# shaped like some other tensor you supply.
x = tf.constant([[[1], [2]]])
zeros_like_x = tf.zeros_like(x, dtype=tf.float32)

print("Shape(x) = %s \nShape(zeros_like_x) = %s" % 
      (x.shape, zeros_like_x.shape))
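The table above also lists tf.ones, tf.ones_like and tf.linspace, which the cell did not use; here is a brief sketch of those:

import tensorflow as tf

# A [2, 3] tensor filled with ones.
ones = tf.ones([2, 3], dtype=tf.float64)
print("tf.ones example: %s" % ones)

# tf.ones_like mirrors the shape (and, by default, the type) of its input.
ones_like = tf.ones_like(ones)
print("tf.ones_like example: %s" % ones_like)

# 5 evenly spaced values from 0 to 1, end points included.
grid = tf.linspace(0.0, 1.0, 5)
print("tf.linspace example: %s" % grid)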

Random Tensors

A common need is to create tensors with a specific shape but with randomly distributed entries. TF provides a
few methods for this.

Name Description
tf.random.normal Generates a constant tensor with independent normal entries.
tf.random.uniform Generates a constant tensor with uniformly distributed elements.
tf.random.gamma Generates a constant tensor with gamma distributed elements.
tf.random.shuffle Takes an input tensor and randomly permutes the entries along the first dimension.

Let us see some of these in action.


In [0]:
#@title Creating Random Tensors

# Create a matrix with normally distributed entries.
x = tf.random.normal([1, 3], mean=1.0, stddev=4.0, dtype=tf.float64)
print("A random normal tensor: %s" % x)

# Randomly shuffle the first dimension of a tensor.
r = tf.random.shuffle([1, 2, 3, 4])
print("Random shuffle of [1,2,3,4]: %s" % r)

Sessions and tf.function

You will need to restart your runtime in order to run the examples in this section. We have discussed above that TensorFlow has an executor accessible through tf.compat.v1.Session. But what does it do? Assume for a moment that eager execution is disabled (tf.compat.v1.disable_eager_execution()).

  • When you write TensorFlow ops or tensors, you are adding them to the "graph".
  • It does not immediately evaluate anything. It only performs some sanity checks
    on your ops.
  • Recall: a tensor itself is not the value. It is a container for the data that will be
    generated when it is evaluated.
  • After creating the graph you have to explicitly ask for one or more of the tensors
    to be evaluated.
  • Let's see this in action:
  • The argument you supply to eval is called a Session.
  • The session is an object that creates/controls/talks to the C++ runtime that will
    actually run your computation.
  • The client (i.e. your python session) transfers the graph information to the session
    to be evaluated.
  • The session evaluates the relevant part of your graph and returns the value to
    your client for you to enjoy.

In [0]:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()

#@title Evaluating Tensors

x = tf.constant([1., 1.])

# Check that x is not actually a list.
print("Type of 'x': %s" % type(x)) # It is a tensor

# Evaluate the tensor to actually make TF to do the computation.
with tf.compat.v1.Session() as sess:
  x_values = sess.run(x)
print("Value of 'x_values': %s\nType of x_values: %s" % (x_values,
                                                         type(x_values)))
  • When you eval a tensor, you are telling TF that you want it to go ahead and run the computation needed to get a value for that tensor. At that point, TF figures out what other operations and tensors it needs to evaluate to be able to give you what you want.
    • This extra step may seem annoying but it is (part of) what makes TF powerful. It allows TF to evaluate only those ops that are directly needed for the output.
  • In the usage above, it is rather inconvenient that we can only evaluate one tensor at a time. There are two ways to avoid this.
  • Create a session variable and hold on to it for use with multiple evals.
    The following example demonstrates this:

In [0]:
#@title Using Sessions

import numpy as np
import matplotlib.pyplot as plt

tf.compat.v1.reset_default_graph()

# Create a rank 1 tensor with evenly spaced entries between -pi and pi.
x = tf.linspace(-np.pi, np.pi, 20)

y = tf.sin(x) + tf.random.uniform([20], minval=-0.5, maxval=0.5)

# Create session object which we will use multiple times.
sess = tf.compat.v1.Session()

plt.plot(sess.run(x), sess.run(y), 'ro')

# A session is a resource (like a file) and you must close it when
# you are done.
sess.close()
  • In the above method, it is still inconvenient to have to call run on each tensor separately.
  • This can be avoided by passing multiple tensors to the session's run method, as follows:

In [0]:
# Continuation of the above example so you must run that first.

sess = tf.compat.v1.Session()

# Session.run evaluates one or more tensors supplied as a list
# or a tuple. It returns their values as numpy arrays which you
# may capture by assigning to a variable.
x_v, y_v = sess.run((x, y))

plt.plot(x_v, y_v, 'ro')

sess.close()
  • It is pretty easy to forget to close sessions, so it is best to use them as context managers. This is the most common way of using sessions.

In [0]:
#@title Sessions as Context managers

import matplotlib.pyplot as plt

x = tf.linspace(-5.0, 5.0, 1000)

y = tf.nn.sigmoid(x)

with tf.compat.v1.Session() as sess:
  x_v, y_v = sess.run([x, y])
  plt.plot(x_v, y_v)

Normally one would develop in eager mode. If the calculations are complex, graph optimization might provide enormous benefits. In this case, tf.function should be used.


In [0]:
# Please reset your runtime to ensure eager execution
import tensorflow as tf
print('Eager execution enabled: ', tf.executing_eagerly())

def eager_execution(a, b):
  a = tf.convert_to_tensor(a, name="a")
  b = tf.convert_to_tensor(b, name="b")
  c = a + b
  print("Tensor `c` is evaluated inside the function call: ", c)
  d = c**2
  return d

@tf.function
def graph_execution(a, b):
  a = tf.convert_to_tensor(a, name="a")
  b = tf.convert_to_tensor(b, name="b")
  c = a + b
  print("Evaluation of `c` is deferred: ", c)
  d = c**2
  return d

print("Eager execution: ", eager_execution(1.0, 2.0))
print("-------------")
print("Graph execution: ", graph_execution(1.0, 2.0))

Example: Random Matrices

Please restart your runtime if you are in graph mode (check that tf.executing_eagerly() returns True). Let us put together a few of the ops we have seen so far (and a few we haven't) into a longer example.

A random matrix is a matrix whose entries are (usually independently) drawn at random from some chosen distribution.

In this example, we will approximate the distribution of the determinant of a random $n \times n$ matrix.

The steps we will follow are:

  • Generate a sample of matrices of a desired size.
  • Compute their determinant.
  • Plot the histogram.

In [0]:
import tensorflow as tf
import matplotlib.pyplot as plt
print('Eager execution enabled: ', tf.executing_eagerly())
# Dimension of matrix to generate.
n = 10

# Number of samples to generate.
sample_size = 100000

# Using suggestion above let us build a graph which we wrap in a `tf.function`.

def random_matrix_generator():
  # We will generate matrices with elements uniformly drawn from (-1, 1).
  # TensorFlow provides a whole bunch of methods to generate random tensors
  # of a given shape and here we will use the random_uniform method.
  samples = tf.random.uniform(shape=[sample_size, n, n], minval=-1, maxval=1)

  # There is also an Op to generate matrix determinant. It requires that you pass
  # it a tensor of shape [...., N, N]. This ensures that the last two dimensions
  # can be interpreted as a matrix.
  # Can you guess what the shape of the resulting determinants is?
  dets_sample = tf.linalg.det(samples)

  print('Determinant shape: ', dets_sample.shape)

  # While we are at it, we might as well compute some summary stats.
  dets_mean = tf.reduce_mean(dets_sample)
  dets_var = tf.reduce_mean(tf.square(dets_sample)) - tf.square(dets_mean)
  return dets_sample, dets_mean, dets_var

# Wrap computations in `tf.function`
random_matrix_generator_optimized = tf.function(random_matrix_generator)
# Evaluate the determinants and plot a histogram.
det_vals, mean, var = random_matrix_generator_optimized()
# Plot a beautiful histogram.
plt.hist(det_vals, 50, density=True, facecolor='green', alpha=0.75)
plt.xlabel('Det(Unif(%d))' % n)
plt.ylabel('Probability')
plt.title(r'$\mathrm{Random\ Matrix\ Determinant\ Distribution:}\ \mu = %f,\ \sigma^2=%f$' % (mean, var))
plt.grid(True)
plt.show()

In this example, we used some ops such as tf.reduce_mean and tf.square which we will discuss more later.

Maths Ops

  • There is a whole suite of commonly needed math ops built in.
  • We have already seen binary ops such as tf.add (addition) and tf.multiply (multiplication).
  • The following five ops can also be accessed as inline operators on tensors.
  • The inline form of the op allows you to e.g. write x + y instead of tf.add(x, y).
Name Description Inline form
tf.add Adds two tensors element wise +
tf.subtract Subtracts two tensors element wise -
tf.multiply Multiplies two tensors element wise *
tf.divide Divides two tensors element wise /
tf.math.mod Computes the remainder of division element wise %
  • Note that the behaviour of "/" and "//" varies depending on python version and presence of from __future__ import division, to match how division behaves with ordinary python scalars.
  • The following table lists some more commonly needed functions:
Name Description
tf.exp The exponential of the argument element wise.
tf.math.log The natural log element wise
tf.sqrt Square root element wise
tf.round Rounds to the nearest integer element wise
tf.maximum Maximum of two tensors element wise.
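Here is a quick sketch of the inline forms and a few of the functions above (the values are arbitrary):

import tensorflow as tf

x = tf.constant([1.0, 4.0, 9.0])
y = tf.constant([2.0, 2.0, 2.0])

# The inline operators map to the element wise ops in the table.
print(x + y)   # tf.add       -> [3.0, 6.0, 11.0]
print(x * y)   # tf.multiply  -> [2.0, 8.0, 18.0]
print(x / y)   # tf.divide    -> [0.5, 2.0, 4.5]
print(x % y)   # tf.math.mod  -> [1.0, 0.0, 1.0]

# A few of the element wise functions.
print(tf.sqrt(x))              # [1.0, 2.0, 3.0]
print(tf.exp(tf.math.log(x)))  # recovers x, up to floating point error
print(tf.maximum(x, y))        # [2.0, 4.0, 9.0]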

Matrix Ops

  • Matrices are rank 2 tensors. There is a suite of ops for doing matrix manipulations which we briefly discuss.
Name Description
tf.linalg.diag Creates a diagonal matrix from a vector of diagonal values
tf.linalg.trace Computes the sum of the diagonal elements of a matrix.
tf.linalg.det Computes the determinant of a matrix (square only)
tf.matmul Multiplies two matrices
tf.linalg.inv Computes the inverse of the matrix (square only)
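A quick sketch of these ops on a small invertible matrix (values chosen arbitrarily):

import tensorflow as tf

m = tf.constant([[2.0, 0.0], [1.0, 3.0]])

print(tf.linalg.diag([1.0, 2.0]))      # 2x2 diagonal matrix from a vector
print(tf.linalg.trace(m))              # 5.0
print(tf.linalg.det(m))                # 6.0
print(tf.matmul(m, tf.linalg.inv(m)))  # approximately the identity matrix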

Quiz: Normal Density

  • In the following mini-codelab, you are asked to compute the normal density using the ops you have seen so far.
  • You first generate a sample of points at which you will evaluate the density.
  • The points are generated using a normal distribution (need not be the same one whose density you are evaluating).
  • This is done by the function generate_normal_draws below.
  • The function normal_density_at computes the density at any given set of points.
  • You have to complete the code of these two functions so they work as expected.
  • Execute the code and check that the test passes.

Hints

  • Recall that the normal density is given by
    $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  • Here $\mu$ is the mean of the distribution and $\sigma > 0$ is the standard deviation.
  • Pay attention to the data types mentioned in the function documentations. You should ensure that your implementations respect the data types stated.

In [0]:
#@title Mini-codelab: Compute the normal density.

import numpy as np
import numpy.testing as npt
from scipy import stats

def generate_normal_draws(shape, mean=0.0, stddev=1.0):
  """Generates a tensor drawn from a 1D normal distribution.
  
    Creates a constant tensor of the supplied shape whose elements are drawn
    independently from a normal distribution with the supplied parameters.

    Args:
      shape: An int32 tensor. Specifies the shape of the return value.
      mean: A float32 value. The mean of the normal distribution. 
      stddev: A positive float32 value. The standard deviation of the 
        distribution.

    Returns:
      A constant float32 tensor whose elements are normally distributed.
  """
  
  # to-do: Complete this function.
  pass

def normal_density_at(x, mean=0.0, stddev=1.0):
  """Computes the normal density at the supplied points.
  
    Args:
      x: A float32 tensor at which the density is to be computed.
      mean: A float32. The mean of the distribution.
      stddev: A positive float32. The standard deviation of the distribution.
      
    Returns:
      A float32 tensor of the normal density evaluated at the supplied points.
  """
  
  # to-do: Complete this function. As a reminder, the normal density is
  # f(x) = exp(-(x-mu)^2/(2*stddev^2)) / sqrt(2 pi stddev^2).
  # The value of pi can be accessed as np.pi.
  pass

def test():
  mu, sd = 1.1, 2.1
  x = generate_normal_draws([2, 3, 5], mean=mu, stddev=sd)
  pdf = normal_density_at(x)
  npt.assert_array_equal(x.shape, [2,3,5], 'Shape is incorrect')
  norm = stats.norm()
  npt.assert_allclose(pdf, norm.pdf(x), atol=1e-6)
  print ("All good!")
    
test()

In [0]:
#@title Mini-codelab Solution: Compute the normal density. Double-click to reveal

import numpy as np
import numpy.testing as npt
from scipy import stats

def generate_normal_draws(shape, mean=0.0, stddev=1.0):
  """Generates a tensor drawn from a 1D normal distribution.
  
    Creates a constant tensor of the supplied shape whose elements are drawn
    independently from a normal distribution with the supplied parameters.

    Args:
      shape: An int32 tensor. Specifies the shape of the return value.
      mean: A float32 value. The mean of the normal distribution. 
      stddev: A positive float32 value. The standard deviation of the 
        distribution.

    Returns:
      A constant float32 tensor whose elements are normally distributed.
  """
  return tf.random.normal(shape=shape, mean=mean, stddev=stddev)

def normal_density_at(x, mean=0.0, stddev=1.0):
  """Computes the normal density at the supplied points.

    Args:
      x: A float32 tensor at which the density is to be computed.
      mean: A float32. The mean of the distribution.
      stddev: A positive float32. The standard deviation of the distribution.

    Returns:
      A float32 tensor of the normal density evaluated at the supplied points.
  """
  normalization = 1.0 / np.sqrt(2.0 * np.pi * stddev * stddev)
  return tf.exp(-tf.square((x - mean) / stddev) / 2.0) * normalization

def test():
  mu, sd = 1.1, 2.1
  x = generate_normal_draws([2, 3, 5], mean=mu, stddev=sd)
  pdf = normal_density_at(x)
  npt.assert_array_equal(x.shape, [2,3,5], 'Shape is incorrect')
  norm = stats.norm()
  npt.assert_allclose(pdf, norm.pdf(x), atol=1e-6)
  print ("All good!")
    
test()

Logical And Comparison Ops

  • Tensorflow has the full complement of logical operators you would expect.
  • These are also overloaded so you can use their inline version.
  • The ops most frequently used are as follows:
Name Description Inline form
tf.equal Element wise equality None
tf.less Element wise less than <
tf.less_equal Element wise less than or equal to <=
tf.greater Element wise greater than >
tf.greater_equal Element wise greater than or equal to >=
tf.logical_and Element wise And &
tf.logical_or Element wise Or |
  • Note that tf.equal doesn't have an inline form. Comparing two tensors with == will use the default python comparison. It will not call tf.equal.

Note about Broadcasting

  • All the binary operators described above expect their operands to be of the same shape up to broadcasting.
  • Broadcasting attempts to find a larger shape that would render the two arguments compatible.
    • Tensorflow's broadcasting behaviour is like Numpy's.
    • Example: [ 1, 2, 3 ] > 0. The LHS is tensor of shape [3] while the right hand side can be promoted to [ 0, 0, 0] which makes it compatible.
    • A less trivial example: tf.equal([[1,2], [2, 3]], [2,3]). The LHS has shape [2,2] while the RHS has shape [2]. The RHS gets broadcast so that it looks like [[2,3],[2,3]] and the comparison is performed element wise.
    • These are the most common cases and we will make extensive use of them below.
    • The full set of rules for broadcasting are available here.

In [0]:
#@title Comparison Ops Examples

a = tf.constant([1.0, 1.0])
b = tf.constant([2.0, 2.0])
c = tf.constant([1.0, 2.0])
d = 3.0

print ("Inputs:\na: %s\nb: %s\nc: %s\nd: %s\n" % (a, b, c, d))
# Less-than op. Tests if the first argument is less than the second argument
# component wise.
a_less_than_b = a < b
b_greater_than_c = b > c

# Simple broadcasting in action
a_less_than_d = a < d

# More complex broadcasting
a2 = tf.constant([[1,2],[2,3]])
b2 = tf.constant([1,3])
c2 = tf.equal(a2, b2)

# Note that there is no inline form for tf.equal. If you do b == c, you will
# not get what you think you might.
b_equal_to_c = tf.equal(b, c)

outputs = (a_less_than_b, b_greater_than_c, a_less_than_d,  b_equal_to_c)

print("Outputs:\na < b: %s\nb > c: %s\na < d: %s\nb == c: %s\n" % (outputs))


print("Complex Broadcasting")
print("%s == %s => %s" % (a2.numpy(), b2.numpy(), c2.numpy()))

Aggregations and Scans

Most of the ops we have seen so far act on the input tensors in an element wise manner. Another important set of operators allows you to do aggregations on a whole tensor as well as to scan the tensor.

  • Aggregations (or reductions) act on a tensor and produce a reduced dimension tensor. The main ops here are
Name Description
tf.reduce_sum Sum of elements along all or some dimensions.
tf.reduce_mean Average of elements along all or some dimensions.
tf.reduce_min Minimum of elements along all or some dimensions.
tf.reduce_max Maximum of elements along all or some dimensions.
  • and for boolean tensors only
Name Description
tf.reduce_any Result of logical OR along all or some dimensions.
tf.reduce_all Result of logical AND along all or some dimensions.
  • Scans act on a tensor and produce a tensor of the same shape.
Name Description
tf.cumsum Cumulative sum of elements along an axis.
tf.math.cumprod Cumulative product of elements along an axis.
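Before the codelab, here is a quick sketch of the boolean reductions and a scan (a fuller set of aggregation examples appears in a later cell):

import tensorflow as tf

flags = tf.constant([[True, False], [True, True]])

print(tf.reduce_any(flags))     # True: at least one element is True
print(tf.reduce_all(flags))     # False: not every element is True
print(tf.reduce_all(flags, 1))  # [False, True]: logical AND along each row

# Scans preserve the shape: a running product along the only axis.
print(tf.math.cumprod(tf.constant([1, 2, 3, 4])))  # [1, 2, 6, 24]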

Codelab: Estimating $\pi$

In this short codelab, we will use an age old method to estimate the value of $\pi$. The idea is very simple: Throw darts at a square and check what fraction lies inside the inscribed circle (see diagram).


In [0]:
#@title
%%html
<svg width="210" height="210">
  <rect x1="0" y1="0" width="200" height="200" stroke="blue" fill="red"
       fill-opacity="0.5" stroke-opacity="0.8"/>
  <circle cx="100" cy="100" r="99" fill="green" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  
  <circle cx="188" cy="49" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="113" cy="130" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="44" cy="78" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="116" cy="131" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="189" cy="188" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="126" cy="98" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="18" cy="42" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="146" cy="62" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="13" cy="139" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>
  <circle cx="157" cy="94" r="3" fill="blue" stroke="rgba(0,20,0,0.7)" stroke-width="2"/>

  <line x1="100" y1="100" x2="170" y2="170" stroke="black"/>
  <text x="144" y="130" text-anchor="middle"><tspan baseline-shift="sub" font-size="normal">1</tspan></text>
</svg>

The steps to estimate it are:

  • Generate $n$ samples of pairs of uniform variates $(x, y)$ drawn from $[-1, 1]$.
  • Compute the fraction $f_n$ that lie inside the unit circle, i.e. have $x^2+y^2 \leq 1$.
  • Estimate $\pi \approx 4 f_n$, because
    • The area of the unit circle is $\pi$
    • The area of the rectangle is $4$
    • $f_{\infty} = \frac{\pi}{4}$.

Your task is to complete the functions generate_sample and compute_fraction below. They correspond to the first and the second steps described above.
The last step is already done for you in the function estimate_pi.


In [0]:
#@title Codelab: Estimating Pi

import numpy as np


def generate_sample(size):
  """Sample a tensor from the uniform distribution.
  
  Creates a tensor of shape [size, 2] containing independent uniformly
  distributed numbers drawn between [-1.0, 1.0].
  
  Args:
    size: A positive integer. The number of samples to generate.
  
  Returns:
    A tensor of data type tf.float64 and shape [size, 2].
  """
  raise NotImplementedError()


def compute_fraction(sample):
  """The fraction of points inside the unit circle.
  
  Computes the fraction of points that satisfy
  sample[0]^2 + sample[1]^2 <= 1.
  
  Args:
    sample: A float tensor of shape [n, 2].
    
  Returns:
    The fraction of n that lie inside the unit circle.
  """
  raise NotImplementedError()

def estimate_pi(num_samples):
  sample = generate_sample(num_samples)
  f_t = compute_fraction(sample)
  error = np.abs(np.pi / 4 - f_t) / (np.pi / 4)
  print ("Estimate: %.5f, Error: %.3f%%" % (4 * f_t, error * 100.))

estimate_pi(100000)

In [0]:
#@title Codelab Solution: Estimating Pi. Double click to reveal

import tensorflow as tf
import numpy as np

def generate_sample(size):
  """Sample a tensor from the uniform distribution.
  
  Creates a tensor of shape [size, 2] containing independent uniformly
  distributed numbers drawn between [-1.0, 1.0].
  
  Args:
    size: A positive integer. The number of samples to generate.
  
  Returns:
    A tensor of data type tf.float64 and shape [size, 2].
  """
  
  return tf.random.uniform(shape=[size, 2], minval=-1.0, maxval=1.0, 
                           dtype=tf.float64)  


def compute_fraction(sample):
  """The fraction of points inside the unit circle.
  
  Computes the fraction of points that satisfy
  sample[0]^2 + sample[1]^2 <= 1.
  
  Args:
    sample: A float tensor of shape [n, 2].
    
  Returns:
    The fraction of n that lie inside the unit circle.
  """

  sq_distance = tf.reduce_sum(tf.square(sample), 1)
  in_circle = tf.cast(sq_distance <= 1.0, dtype=sq_distance.dtype)
  return tf.reduce_mean(in_circle)


def estimate_pi(num_samples):
  sample = generate_sample(num_samples)
  f_t = compute_fraction(sample)
  error = np.abs(np.pi / 4 - f_t) / (np.pi / 4)
  print ("Estimate: %.5f, Error: %.3f%%" % (4 * f_t, error * 100.))

estimate_pi(100000)

In [0]:
#@title Aggregation/Scan examples

# Generate a tensor of gamma-distributed values and aggregate them
x = tf.random.gamma([100, 10], 0.5)

# Adds all the elements
x_sum = tf.reduce_sum(x)

# Adds along the first axis.
x_sum_0 = tf.reduce_sum(x, 0)

# Maximum along the first axis
x_max_0 = tf.reduce_max(x, 0)

# Cumulative sum for x_max_0:
x_max_cumsum = tf.cumsum(x_max_0)

print("Total Sum: %s\n" % x_sum)
print("Partial Sum: %s\n" % x_sum_0)
print("Maximum: %s\n" % x_max_0)
print("Cumulative sum of x_max_0: %s\n" % x_max_cumsum)

Mixing and Locating Elements

We often need to be able to mix two tensors based on the values in another tensor. The where_v2 op is particularly useful in this context. It has two major uses:

  • tf.where_v2(Condition, T, F): Allows you to mix and match elements of two tensors based on a boolean tensor.

    • All tensors must be of the same shape (or broadcastable to same).
    • T and F must have the same data type.
    • Picks elements of T where Condition is true and F where Condition is False.
    • Example: tf.where_v2([True, False], [1, 2], [3, 4]) $\rightarrow$ [1, 4].
  • tf.where_v2(tensor): Alternatively, if T and F aren't supplied, then the op returns locations of elements which are true.

    • Example: tf.where_v2(tf.constant([1, 2, 3, 4]) > 2) $\rightarrow$ [[2], [3]]

Example:

Let's see them in action. We will create a tensor of integers between 1 and 50 and set all multiples of 3 to 0.


In [0]:
import numpy as np

# Create a tensor with numbers between 1 and 50
nums = tf.constant(np.arange(1, 51))

# The % operator (tf.math.mod) gives the remainder of the division x / y.
# Find all multiples of 3.
to_replace = tf.equal(nums % 3, 0)

# First form of where_v2: if to_replace is true, tf.where_v2 picks the element 
# from the first tensor and otherwise, from the second tensor.
result = tf.compat.v2.where(to_replace, tf.zeros_like(nums), nums)
print(result)

In [0]:
# Now let's confirm that we did indeed set the right numbers to zero. 
# This is where the second form of tf.where_v2 helps us. It will find all the 
# indices where its first argument is true.
# Keep in mind that tensors are zero indexed (i.e. the first element has 
# index 0) so we will need to add a 1 to the result.
zero_locations = tf.compat.v2.where(tf.equal(result, 0)) + 1

print(tf.transpose(zero_locations))

Slicing and Joining

There are a number of ops which allow you to take parts of a tensor as well as join multiple tensors together.

Before we discuss those ops, let's look at how we can use the usual array indexing to access parts of a tensor.

Indexing

  • Even though tensors are not arrays in the usual sense, you can still index into them.

  • The indexing produces tensors which may be evaluated or consumed further in the usual way.

  • Indexing is a shortcut for writing an explicit op (just like you can write x + y instead of tf.add(x, y)).

  • Tensorflow's indexing works similarly to Numpy's.


In [0]:
#@title Indexing

x = tf.constant([[1, 2, 3], [4, 5, 6]])

# Get a tensor containing only the first component of x.
x_0 = x[0]

# A tensor of the first two elements of the first row.
x_0_12 = x[0, 0:2] 

print("x_0: %s" % x_0)
print("x_0_12: %s" % x_0_12)
  
# You can also do this more generally with the tf.slice op which is useful
# if the indices you want are themselves tensors.

x_slice = tf.slice(x, [0, 0], [1, 2])

print("With tf.slice: %s" % x_slice)

Coming back to the ops that are available for tailoring tensors, here are a few of them

Name Description
tf.slice Take a contiguous slice out of a tensor.
tf.split Split a tensor into equal pieces along a dimension
tf.tile Tile a tensor by copying and concatenating it
tf.pad Pads a tensor
tf.concat Concatenate tensors along a dimension
tf.stack Stacks n tensors of rank R into one tensor of rank R+1

Let's briefly look at these ops in action.


In [0]:
#@title Slicing and Joining Examples

x = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Slice takes a starting index somewhere in the tensor and a size in each
# dimension that you want to keep. It allows you to pass tensors for the start
# position and the sizes. Note that the shape of the result is the same as the size arg.
start_index = tf.constant([1, 1])
size = tf.constant([1, 2])
x_slice = tf.slice(x, start_index, size)
print("tf.slice")
print("x[1:2, 1:3] = %s" % x_slice)


# Split splits the tensor along any given dimension. The return value is a list
# of tensors (and not just one tensor).
pieces = tf.split(x, 3, 0)

print("\ntf.split")
print(pieces)
  
# Tile makes a bigger tensor out of your tensor by tiling copies of it in the
# dimensions you specify. 

y = tf.constant([[1, 2], [3, 4]])

tiled = tf.tile(y, [2, 2])

print("\n tf.tile")
print("Y:\n%s\n" % y)
print("Y tiled twice in both dims:\n%s\n" % tiled)
  
# Pad has a few modes of operation but the simplest one is where you pad a 
# tensor with zeros (the default mode). You specify the amount of padding you
# want at the top and at the bottom of each dimension. In this example, we will
# pad y defined above with zero asymmetrically

padded = tf.pad(y, paddings=[[1, 2], [3, 4]])
print("\n tf.pad")
print("Y with padding:\n%s\n" % padded)

# Concat simply concatenates two tensors of the same rank along some axis.
x = tf.constant([[1], [2]])
y = tf.constant([[3], [4]])
x_y = tf.concat([x, y], 0)
print("\n tf.concat")
print("Concat X and Y:\n%s\n" % x_y)

# Stack is quite useful when you have a bunch of tensors and you want to join
# them into a higher rank tensor. Let's take the same x and y as above.
stacked = tf.stack([x, y], axis=0)
print("\n tf.stacked")
print("Stacked X and Y:\n%s\n" % stacked)
print("Shape X: %s, Shape Y: %s, Shape of Stacked_0: %s" % 
      (x.shape, y.shape, stacked.shape))

Codelab: Distribution of Bernoulli Random Matrices

It's time to flex those tensorflow muscles. Using the ops we have seen so far, let us reconsider the distribution of the
determinant. As it happens, mathematicians focus a lot more on random matrices whose entries are either -1 or 1.
They worry about questions regarding the singularity of a random Bernoulli matrix.

In this exercise, you are asked to generate random matrices whose entries are either +1 or -1
with probability p for +1 (and 1-p for -1). The function bernoulli_matrix_sample needs to return a tensor of such
matrices.

Once that is done, you can run the rest of the code to see the plot of the empirical distribution for the determinant.


In [0]:
#@title Imports And Setup: Run Me First!

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

import seaborn as sns
sns.set(color_codes=True)

def plot_det_distribution(sample_tensor, p=0.5):
  """Plots the distribution of the determinant of the supplied tensor.
  
  Computes the determinant of the supplied sample of matrices and plots its
  histogram.
  
  Args:
    sample_tensor: A tensor of shape [sample_size, n, n].
    p: The probability of generating a +1. Used only for display.
  
  Returns:
    The mean and the variance of the determinant sample as a tuple.
  """

  dets_sample = tf.linalg.det(sample_tensor)
  dets_uniq, _, counts = tf.unique_with_counts(dets_sample)
  dets_mean = tf.reduce_mean(dets_sample)
  dets_var = tf.reduce_mean(tf.square(dets_sample)) - tf.square(dets_mean)
  
  # Get values from the eager tensors
  det_vals, count_vals, mean, var = (dets_uniq.numpy(),
                                     counts.numpy(),
                                     dets_mean.numpy(),
                                     dets_var.numpy())
  num_bins = min(len(det_vals), 50)
  
  plt.hist(det_vals, num_bins, weights=count_vals, density=True, facecolor='green', 
           alpha=0.75)
  plt.xlabel('Det(Bern(p=%.2g))' % p)
  plt.ylabel('Probability')
  plt.title(r'$\mathrm{Determinant\ Distribution:}\ \mu = %.2g,\ \sigma^2=%.2g$'
            % (mean, var))
  plt.grid(True)
  plt.show()
  return mean, var

In [0]:
#@title Codelab: Bernoulli Matrix Distribution

# NB: Run the Setup and Imports above first.

def bernoulli_matrix_sample(n, size, p=0.5):
  """Generate a sample of matrices with entries +1 or -1.
  
  Generates matrices whose elements are independently drawn from {-1, 1}
  with probability {1-p, p} respectively.
  
  Args:
    n: The dimension of the (square) matrix to generate. An integer.
    size: The number of samples to generate.
    p: The probability of drawing +1.

  Returns:
    A tf.Tensor object of shape [size, n, n] and data type float64.
  """
  # Tensorflow provides a number of distributions to generate random tensors.
  # This includes uniform, normal and gamma. The Tensorflow API docs are an
  # excellent reference for this and many other topics.
  # https://www.tensorflow.org/api_docs/python/tf/random
  # 
  # Unfortunately, however, there is no bernoulli sampler in base tensorflow.
  # There is one in one of the libraries but we will discuss that later.
  # For now, you need to use a uniform sampler to generate the desired sample.
  gen_shape = [size, n, n]
  draws = tf.random.uniform(shape=gen_shape, dtype=tf.float64)
  # Use float64 ones so the returned tensor matches the documented dtype.
  ones = tf.ones(shape=gen_shape, dtype=tf.float64)
  raise NotImplementedError()
  

prob_1 = 0.5
sample = bernoulli_matrix_sample(5, 1000000, prob_1)
plot_det_distribution(sample, prob_1)

In [0]:
#@title Codelab Solution: Bernoulli Matrix Distribution - Double click to reveal
# NB: Run the Setup and Imports above first.

@tf.function
def bernoulli_matrix_sample(n, size, p=0.5):
  """Generate a sample of matrices with entries +1 or -1.
  
  Generates matrices whose elements are independently drawn from {-1, 1}
  with probability {1-p, p} respectively.
  
  Args:
    n: The dimension of the (square) matrix to generate. An integer.
    size: The number of samples to generate.
    p: The probability of drawing +1.

  Returns:
    A tf.Tensor object of shape [size, n, n] and data type float64.
  """

  # Tensorflow provides a number of distributions to generate random tensors.
  # This includes uniform, normal and gamma. The Tensorflow API docs are an
  # excellent reference for this and many other topics.
  # https://www.tensorflow.org/api_docs/python/tf/random
  # 
  # Unfortunately, however, there is no bernoulli sampler in base tensorflow.
  # There is one in one of the libraries but we will discuss that later.
  # For now, you need to use a uniform sampler to generate the desired sample.

  gen_shape = [size, n, n]
  # Use float64 ones so the returned tensor matches the documented dtype.
  ones = tf.ones(shape=gen_shape, dtype=tf.float64)
  draws = tf.random.uniform(shape=gen_shape, dtype=tf.float64)
  return tf.where(draws <= p, ones, -ones)

prob_1 = 0.5
sample = bernoulli_matrix_sample(5, 1000000, prob_1)
plot_det_distribution(sample, prob_1)
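
As a quick sanity check (this relies on a standard fact not stated in the codelab): for p = 0.5 the entries have mean 0 and variance 1, and for an n x n matrix with iid mean-zero, unit-variance entries E[det] = 0 and E[det^2] = n!. With n = 5 the empirical variance should therefore be close to 120.


In [0]:
# Compare the empirical mean and variance with the theoretical values 0 and n!.
import math

mean, var = plot_det_distribution(sample, prob_1)
print("Empirical mean: %.3f (expected 0)" % mean)
print("Empirical variance: %.1f (expected %d)" % (var, math.factorial(5)))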

Control Flow

tf.cond

The guide below is relevant only in deferred execution mode (i.e., either with eager execution disabled or inside a tf.function). For the sake of definiteness, assume below that eager execution has been disabled:

tf.compat.v1.disable_eager_execution()

In Python (like in most imperative languages), we have the if-else construct which allows us to do different things based on the value of some variable.

The equivalent construct in Tensorflow is the tf.cond op. Consider the following (very contrived) example:


In [0]:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()

# Create a vector of 10 iid normal variates.
x = tf.random.normal([10], name="x")

# If the average of the absolute values of x is greater than 1, we 
# return a tensor of 0's otherwise a tensor of 1's
# Note that the predicate must return a boolean scalar.
w = tf.cond(tf.reduce_mean(tf.abs(x)) > 1.0,
            lambda: tf.zeros_like(x, name="Zeros"),
            lambda: tf.ones_like(x, name="Ones"), name="w")

w.eval(session=tf.compat.v1.Session())

Some things to note here:

  • The predicate must be a scalar tensor (or a value convertible to a scalar tensor).
  • Each branch is provided as a Python function taking no arguments and returning one or more tensors.
  • Both branches must return the same number and type of tensors.
  • The evaluation model is lazy. The branch not taken is not evaluated.
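
To see why a dedicated op is needed at all: in deferred execution a plain Python if cannot branch on a symbolic tensor, because the tensor has no value while the graph is being built. A minimal sketch (the commented lines would raise a TypeError about using a tf.Tensor as a Python bool):


In [0]:
# Eager execution is still disabled here, so x is a symbolic tensor.
x = tf.random.normal([10])

# The following would fail at graph building time, because a symbolic tensor
# cannot be used as a Python boolean:
#
# if tf.reduce_mean(tf.abs(x)) > 1.0:
#   w = tf.zeros_like(x)
# else:
#   w = tf.ones_like(x)
#
# tf.cond (as used above) defers the decision to run time instead.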

Inputting Data

So far we have used data that we generated on the fly. Real world problems typically come with external data sources.

If the data set is of small to medium size, we can load it into the Python session using the usual file APIs.

If we are using a TensorFlow pipeline to process this data, we need to feed it in somehow.

TensorFlow provides a couple of mechanisms to do this.

The simplest is the feed mechanism, which we consider first.

Feed Mechanism

We have seen that TensorFlow computation is basically graph evaluation. TensorFlow allows you to "cut" the graph at some edge and replace the tensor on that edge with a value that you "feed" in.

This can be done with any tensor, whether it comes from a constant, a variable or any other op. You do this by passing an override value for that tensor to Session.run() through an argument called feed_dict.

Let's consider an example


In [0]:
import numpy as np

tf.compat.v1.reset_default_graph()
# Build a simple graph.
x = tf.constant(4.0)

# y = √x
y = tf.sqrt(x)

# z = x^2
z = tf.square(x)

# w = √x + x^2
w = y + z


with tf.compat.v1.Session() as sess:
  print("W by default: %s\n" % sess.run(w))

  # By default y should evaluate to sqrt(4) = 2. 
  # We cut that part of the graph and set y to 10.
  print("(W|y=10) = %s" % sess.run(w, feed_dict={y: 10.0}))
  
  # You can also replace z at the same time.
  print("(W|y=10, z=1) = %s" % sess.run(w, feed_dict={y: 10.0, z: 1.0}))
  
  # At this point, you can generate the values to be fed in any manner
  # you like, including calling functions.
  print("(W|y=random,z=1) = %s" % sess.run(w, feed_dict={y: np.random.rand(),
                                                         z: 1.0}))
  
  # What you cannot do, however, is supply a value which would be inconsistent
  # with the expected shape or type of the original tensor. This is true even
  # if you stay consistent with the relevant bit of the graph.
  # In this (non-)example, we attempt to replace both y and z with a vector
  # and Tensorflow doesn't like that.
  
  #print("(W|y=[random],z=[1])=%s" % sess.run(w,feed_dict={y: [0.0], z: [1.0]}))
  • So we see that while we can replace the value of any tensor, we cannot change the shape or the type of the tensor.
  • The feed value must be a concrete object and not a tensor. Python lists, numpy arrays and scalars are all OK.
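
Since any tensor can be overridden, including constants, here is a minimal sketch that continues the example above and feeds the constant x itself:


In [0]:
# Continuation of the previous example; run that cell first.
with tf.compat.v1.Session() as sess:
  # Overriding the constant x = 4.0 with 9.0: w = sqrt(9) + 9^2 = 84.
  print("(W|x=9) = %s" % sess.run(w, feed_dict={x: 9.0}))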

Placeholders

Placeholders are relevant only in deferred execution. When using tf.function, the inputs to the function become the placeholders.

  • The feed mechanism is a convenient, if somewhat ad hoc, way to input data.
  • While you can replace anything, it is usually not a good idea to replace arbitrary tensors except for debugging.
  • TensorFlow provides tf.compat.v1.placeholder objects whose only job is to be fed data.
  • They can be bound to data only at run time.
  • They are defined by their shape and data type. At run time they expect to be fed a concrete object of that shape and type.
  • It is an error to not supply a required placeholder (though there is a way to specify defaults — see the sketch after the example below).

Let us see them in action:


In [0]:
import numpy as np

# Define a placeholder. You need to define its type and shape and these will be
# enforced when you supply the data.
x = tf.compat.v1.placeholder(tf.float32, shape=(10, 10))  # A square matrix

y = tf.linalg.det(x)

with tf.compat.v1.Session() as sess:
  value_to_feed = np.random.rand(10, 10)
  print(sess.run(y, feed_dict={x: value_to_feed}))

  # You can check that if you do not feed the value of x, you get an error.
  #sess.run(y)  ## InvalidArgumentError
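
As noted above, there is also a way to give a placeholder a default value. A minimal sketch using tf.compat.v1.placeholder_with_default: if nothing is fed, the default is used; if a value is fed, it overrides the default.


In [0]:
# A placeholder with a default value.
d = tf.compat.v1.placeholder_with_default(tf.ones([2, 2]), shape=[None, 2])
s = tf.reduce_sum(d)

with tf.compat.v1.Session() as sess:
  print("Without feeding: %s" % sess.run(s))  # Sum of the default ones: 4.0
  print("With feeding: %s" % sess.run(s, feed_dict={d: [[1, 2], [3, 4]]}))  # 10.0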

Shapes Revisited

The Problem

  • Placeholders are commonly used as a slot where you can enter your data for training.
  • Data is typically supplied in batches suitable for use with stochastic gradient descent or some variant thereof.
  • It is pretty inconvenient to hard-code the batch size.
  • But placeholder definition requires a shape!

The Solution

  • Allow shapes which are potentially unknown at graph building time but will be known at run time.
  • This is done by setting one or more dimensions in a shape to None.
  • For example, a shape of [None, 4] indicates that we plan to have a matrix with 4 columns but some unknown number of rows.
  • An obvious point: constants cannot be defined with unknown shape.

Let's look at some examples with partially specified shapes for placeholders.


In [0]:
import tensorflow as tf

# Defines a placeholder with unknown number of rows and 2 columns.
x = tf.compat.v1.placeholder(tf.float32, shape=[None, 2])

# You can do almost everything that you can do with a fully specified shape
# tensor. Here we compute the sum of squares of elements of x.
y = tf.reduce_sum(x * x)

with tf.compat.v1.Session() as sess:
  # When evaluating, you can specify any value of x compatible with the shape

  # A 2 x 2 matrix is OK
  print("2x2 input: %s" % sess.run(y, feed_dict={x: [[1, 2], [3, 4]]}))
  
  # A 3 x 2 matrix is also OK
  print("3x2 input: %s" % sess.run(y, feed_dict={x: [[1, 2], [3, 4], [5, 6]]}))
  • This seems absolutely awesome, so is there a downside to this?
  • Yes!
    • Unspecified shapes allow you to build ops which may fail at run time even though
      they look fine at graph building time, as the following example demonstrates.

In [0]:
# Continuation of the previous example. Run that first.

# This seems OK because while a shape of [None, 2] is not always square, it
# could be square. So Tensorflow is OK with it.
z = tf.linalg.det(x * x)

with tf.compat.v1.Session() as sess:
  # With a 2x2 matrix we have no problem evaluating z
  print("Det([2x2]): %s" % sess.run(z, feed_dict={x:[[1, 2], [3, 4]]}))

  # But with a 3x2 matrix we get an error, since the determinant is only
  # defined for square matrices.
  #print("Det([3x2]): %s" % sess.run(z, feed_dict={x:[[1, 2], [3, 4], [1, 4]]}))

tf.shape vs tensor.get_shape

Earlier we encountered two different ways to get the shape of a tensor. Now we can see the difference between these two.

  • tensor.get_shape(): Returns the statically determined shape of a tensor. It is possible that this is only partially known.
  • tf.shape(tensor): Returns the actual, fully specified shape of the tensor, but as a tensor itself, so its value is only guaranteed to be known at run time.

Let's see the difference in action:


In [0]:
x = tf.compat.v1.placeholder(tf.int32, [None, None])

# This is a tensor so we have to evaluate it to get its value.
x_s = tf.shape(x)

with tf.compat.v1.Session() as sess:
  print("Static shape of x: %s" % x.get_shape())
  print("Runtime shape of x: %s" % sess.run(x_s, feed_dict={x: [[1],[2]]}))

Reading Files

  • While data can be fed in through placeholders, it would be more efficient still if we could just ask TensorFlow to read directly from data files.

  • There is a large, well-developed framework in TF to do this.

  • To get an idea of the steps involved, tensorflow.org has this to say about it:

A typical pipeline for reading records from files has the following stages:

  1. The list of filenames
  2. Optional filename shuffling
  3. Optional epoch limit
  4. Filename queue
  5. A Reader for the file format
  6. A decoder for a record read by the reader
  7. Optional preprocessing
  8. Example queue
  • However, if you are not setting up a large-scale distributed TensorFlow job, you can get away with using standard Python IO along with placeholders.

In the following example, we read a small CSV file (faked with a StringIO object) using numpy and bind the data to placeholders.


In [0]:
# We'll use StringIO (to avoid external file handling in colab) to fake a CSV 
# file containing two integer columns labeled x and y. In reality, you'd be 
# using something like 
# with open("path/to/csv_file") as csv_file:
import numpy as np
from io import StringIO
csv_file = StringIO(u"""x,y
0,1
1,2
2,4
3,8
4,16
5,32""")


# A vector of unknown length. Note that shape=(None) would mean a completely
# unspecified shape, so we use [None] for a rank-1 tensor.
x = tf.compat.v1.placeholder(tf.int32, shape=[None])
y = tf.compat.v1.placeholder(tf.int32, shape=[None])

z = x + y

# There are many ways to read the data in using standard python utilities.
# Here we use the numpy method to directly read into a numpy array.
data = np.genfromtxt(csv_file, dtype='int32', delimiter=',', skip_header=1)

print("x: %s" % data[:, 0])
print("y: %s" % data[:, 1])

# Now we can evaluate the tensor z using the loaded data to replace the 
# placeholders x and y
with tf.compat.v1.Session() as sess:
  print("z: %s" % sess.run(z, feed_dict={x: data[:,0], y: data[:, 1]}))