This notebook introduces you to TensorFlow concepts. In most cases, you won't need to worry about low-level operations like the ones we illustrate here, but they're good foundational knowledge to have. As you'll see in the following notebooks, TensorFlow provides higher-level APIs that enable you to quickly write and experiment with neural networks. But first things first. Read on to learn the basics, and try the exercise at the end.

Warm-Up

Review: TensorFlow's Computational Model

TensorFlow uses directed graphs to describe computations. Understanding how this computational model works (and how to write graphs that do what you want) requires a slight change in thinking from normal programming.

To help you form a correct mental model, let's walk through a simple example that performs the equivalent of the following Python code, but in TensorFlow:

x = 1.0
x = x + 1.5

Defining a Graph is Like Creating a Blueprint

The first thing to keep in mind when writing TensorFlow code is that most of your code serves to define the computational graph, which you can think of as a kind of blueprint. Like a blueprint, the graph can't do anything on its own -- it isn't until you run ops in the graph (within the context of a Session) that any work actually happens. This will become clearer as we step through this example.

To get things started, we'll import TensorFlow and create the Graph object that will hold variables and operations:

import tensorflow as tf

with tf.Graph().as_default():
  pass  # Our TensorFlow code will go here

Here, the as_default() method makes the newly created graph the default, so that all subsequent Variables and operations are added to the graph created by the tf.Graph() call.
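As a quick check of this scoping behavior, here's a minimal standalone sketch (not part of the running example) showing that ops created inside the with block are recorded in the graph object we created:

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    a = tf.constant(0.0)  # recorded in g while g is the default

print(a.graph is g)                 # True
print(g is tf.get_default_graph())  # False -- g is only the default inside the block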

Next, let's create a constant:

c = tf.constant(1.5)

When Python executes this line of code, it starts defining the blueprint for our graph:
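You can see the blueprint nature directly: printing c at this point shows the Tensor's metadata, not the value 1.5, because nothing has been computed yet:

print(c)  # Tensor("Const:0", shape=(), dtype=float32) -- metadata only, no value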

Next, let's create a Variable:

x = tf.Variable(1.0, name="x")

Executing this line creates a slot for a Variable in the graph:

In fact, this single line of code creates three things in the graph:

  1. A node for the Variable.
  2. Another constant.
  3. An assign op.

When Python executes the code x = tf.Variable(1.0, name="x"), the Variable is declared but not yet allocated or initialized. The Variable is also assigned the name we passed in. As you can see in the figure, TensorFlow takes the name and appends ":0" to it to get "x:0" as the final Tensor's name.

It helps to understand how objects are named in TensorFlow. Each operation is given a unique name. In the case of the node for the Variable, we've created an operation of type VariableOp named "x" that produces a Tensor. This tensor is a writable reference type (tf.float32_ref) named "x:0".

  • Tensors are named after the operation that produces them and are sequentially numbered. A VariableOp always has one output, so the Tensor is named "x:0". Some operations, such as tf.split, produce multiple outputs. For example, tf.split(input, 2, name='x') would produce two output Tensors named "x:0" and "x:1".

  • If an operation is assigned a name that already exists in the graph, then TensorFlow ensures uniqueness by appending an underscore and a number. For example, declaring a second Variable named "x" would yield a name of "x_1:0", as the sketch below demonstrates.
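You can verify both naming rules yourself. Here's a minimal sketch that requests the name "x" twice and prints the names TensorFlow actually assigns:

import tensorflow as tf

with tf.Graph().as_default():
    a = tf.Variable(1.0, name="x")
    b = tf.Variable(2.0, name="x")  # same requested name
    print(a.name)  # x:0   -- first (and only) output of the op named "x"
    print(b.name)  # x_1:0 -- TensorFlow appended "_1" to keep op names unique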

The Python variable x contains a reference to that node, but at this point you cannot access the value of the Variable in TensorFlow (because it hasn't been initialized yet). The addition of the constant and assign op lays the groundwork for initializing the Variable.

The reference returned by assign allows you to access the new value of the Variable. Note that this returned reference must be run for the assignment to actually take effect. In this case, the assign op will be run for us automatically when we run the tf.global_variables_initializer() op later.
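To see this behavior in isolation, here's a minimal standalone sketch (separate from our running example): the assignment is defined up front, but has no effect until the returned reference is run:

import tensorflow as tf

with tf.Graph().as_default():
    v = tf.Variable(1.0)
    bump = tf.assign(v, 5.0)  # defines the assignment; nothing happens yet
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(v))     # 1.0 -- bump hasn't run
        print(sess.run(bump))  # 5.0 -- running the reference triggers the assign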

Now let's specify our addition:

add_op = tf.add(x, c)

The add op is also added to the graph and connected to the previously defined constant and Variable:

Finally, let's assign the result back to the Variable represented by x:

assign_op = tf.assign(x, add_op)

Our graph now looks like this:

If you inspect the graph above, you'll notice that our Variable is attached to two different assign ops, but nothing in the graph specifies an order between them (which assign op should execute first, and why?). (There is an implied order for the other, connected nodes -- for example, tf.add must run before the topmost tf.assign op, because the assign consumes its output.) Clearly, we want to initialize the Variable before we add a number to it and assign the result back. We'll order these operations explicitly below through separate calls to Session.run().
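As an aside, if you ever want an ordering between such ops encoded in the graph itself, rather than enforced by separate Session.run() calls, TF 1.x provides tf.control_dependencies. A minimal sketch, separate from our running example:

import tensorflow as tf

with tf.Graph().as_default():
    x = tf.Variable(1.0)
    set_ten = tf.assign(x, 10.0)
    # Ops created inside this block won't run until set_ten has run,
    # even though no data flows between them.
    with tf.control_dependencies([set_ten]):
        add_one = tf.assign(x, x + 1.0)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(add_one))  # 11.0 -- set_ten ran first, then the add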

On the topic of initialization, we add one last op to initialize all variables:

init = tf.global_variables_initializer()

This op will cause our Variable to be allocated and initialized through the assign() op automatically created for us (the one at the bottom of the figure).
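Because the graph is just a data structure at this point, you can inspect the blueprint directly. A minimal sketch, assuming the graph was captured as g via tf.Graph().as_default() as g (as in the final code below):

# List every node recorded in the blueprint so far, by name and type.
for op in g.get_operations():
    print(op.name, op.type)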

Our final graph looks like this - note the addition of the tf.global_variables_initializer() op:

But, remember, this is still just a blueprint! No computation has yet occurred in the graph -- we still need to run the graph to initialize the Variable, perform the addition, and assign the result back.

To run the graph, we create a Session:

with tf.Session() as sess:
    pass  # ...

Creating a Session brings our blueprint to life:

We can now call our init op to initialize the Variable:

with tf.Session() as sess:
    sess.run(init)

This causes the following subgraph to come to life...

...and initialize the Variable:

As the figure suggests, only the subgraph connected to the tf.global_variables_initializer() op executes.

We can now perform the addition and assignment. However, instead of calling each operation individually, we can simply run the topmost assign op -- TensorFlow will automatically determine the dependencies and execute them. In this case, the assign op depends on the result of the add op, so it runs that first:

sess.run(assign_op)

Calculating the subgraph to run:

Performing the addition:

Performing the assignment back to our Variable:

At this point, we can retrieve the new value of the Variable, if we'd like:

print(sess.run(x))  # Should print out 2.5
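Incidentally, Session.run() returns the fetched op's output, so the earlier sess.run(assign_op) call could itself have printed the updated value (note that each run of assign_op re-executes the add, so run it only once per update):

print(sess.run(assign_op))  # Prints the newly assigned value directly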

The final code:

import tensorflow as tf

with tf.Graph().as_default() as g:
  x = tf.Variable(1.0, name="x")
  add_op = tf.add(x, tf.constant(1.5))
  assign_op = tf.assign(x, add_op)
  init = tf.global_variables_initializer()
  with tf.Session() as sess:
    sess.run(init)
    sess.run(assign_op)
    print(sess.run(x))

Normally, you won't need to worry about explicitly creating a Graph or Session, as we've done here -- these objects will be created for you automatically by higher-level APIs. And, in many cases, you won't need to worry about defining really low-level operations like we illustrated here. However, it's still useful to have a basic model of how to define computation in TensorFlow, and how TensorFlow actually performs those computations.
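For example, TensorFlow maintains an implicit default graph, so the same computation can be written without an explicit Graph block at all. A minimal sketch of that variant:

import tensorflow as tf

# With no explicit tf.Graph(), ops are added to the implicit default graph.
x = tf.Variable(1.0, name="x")
assign_op = tf.assign(x, tf.add(x, tf.constant(1.5)))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(assign_op)
    print(sess.run(x))  # 2.5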

In the next section, you'll get practice in writing some basic TensorFlow code.

Exercise: Implement the Fibonacci Sequence in TensorFlow

In this exercise, your goal is to calculate the Fibonacci sequence using tensors, where $fib(n) = fib(n-1) + fib(n-2)$, and $fib(0) = 0$ and $fib(1) = 1$.

Spend about 10 minutes on this exercise, then check the solution.

To calculate this sequence, use the following two tensors:

  • fib_sequence, a 2x1 2D tensor Variable that represents the latest two values of the Fibonacci sequence (the nth and (n-1)th). Initialize fib_sequence with the following two values: $\begin{bmatrix}0.0\\1.0\end{bmatrix}$
  • fib_matrix, a constant 2x2 2D tensor that generates the next entries in the Fibonacci sequence: $\begin{bmatrix}0.0 & 1.0\\1.0 & 1.0\end{bmatrix}$

If you perform a matrix multiplication of fib_matrix and fib_sequence, you get the next pair of values in the sequence (the nth and the (n+1)th):

$$\begin{bmatrix}0.0 & 1.0\\1.0 & 1.0\end{bmatrix} \begin{bmatrix}0.0\\1.0\end{bmatrix} = \begin{bmatrix}1.0\\1.0\end{bmatrix}$$

Multiplying fib_matrix by the previous result produces the next pair: $\begin{bmatrix}1.0\\2.0\end{bmatrix}$. And so on.
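Before writing the TensorFlow version, you can sanity-check the matrix recurrence in plain NumPy (this is only a check, not the exercise solution):

import numpy as np

fib_matrix = np.array([[0.0, 1.0],
                       [1.0, 1.0]])
fib_sequence = np.array([[0.0],
                         [1.0]])

for _ in range(5):
    fib_sequence = fib_matrix @ fib_sequence
    print(fib_sequence.ravel())
# [1. 1.], [1. 2.], [2. 3.], [3. 5.], [5. 8.]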

In the code cell below, perform the following steps:

  • Create fib_sequence, the 2x1 2D tensor Variable to hold the current values of the Fibonacci sequence.
  • Perform the matrix multiplication of fib_matrix and fib_sequence using tf.matmul(), and assign the result back to fib_sequence using tf.assign().

Make sure the tensors you create are the ones run in the session loop at the bottom of the cell, so that the computations are actually performed.


In [ ]:
import tensorflow as tf

with tf.Graph().as_default() as g:

    # Add code that will calculate and output the Fibonacci sequence
    # using TF. You will need to make use of tf.matmul() and
    # tf.assign() to perform the multiplications and assign the result
    # back to the variable fib_sequence.

    fib_matrix = tf.constant([[0.0, 1.0],
                              [1.0, 1.0]])

    ### SOLUTION START ###
    # Put your solution code here.

    # Step 1.
    # Change this line to initialize fib_seq to a 2x1 TensorFlow
    # tensor *Variable* with the initial values of 0.0 and 1.0. Hint:
    # You'll need to make sure you specify a 2D tensor of shape 2x1,
    # not a 1D tensor. See fib_matrix above (a 2x2 2D tensor) to guide
    # you.
    fib_sequence = None
    
    # Step 2.
    # Change this line to multiply fib_matrix and fib_sequence using tf.matmul()
    next_fib = None
    
    # Step 3.
    # Change this line to assign the result back to fib_sequence using tf.assign()
    assign_op = None
    
    ### SOLUTION END ###
    
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for step in range(10):
            sess.run(assign_op)
            print(sess.run(fib_sequence))