This notebook introduces you to TensorFlow concepts. In most cases, you won't need to worry about low-level operations like we illustrate here, but it's good foundational knowledge for you to have. As you'll see in the following notebooks, TensorFlow provides higher level APIs that enable you to quickly write and experiment with Neural Networks. But first things first. Read on to learn the basics, and try the exercise at the end.
TensorFlow uses directed graphs to describe computations. Understanding how this computational model works (and how to write graphs that do what you want) requires a slight change in thinking from normal programming.
To help you form a correct mental model, let's walk through a simple example that performs the equivalent of the following Python code, but in TensorFlow:
x = 1.0
x = x + 1.5
The first thing to keep in mind when writing TensorFlow code is that most of
your code serves to define the computational graph, which you can think of as
a kind of blueprint. As a blueprint, the graph can't do anything -- it isn't
until you run ops in the graph (within the context of a Session) that it can
actually do work. This will become clearer as we step through this example.
To get things started, we'll import TensorFlow and create the Graph object
that will hold variables and operations:
import tensorflow as tf

with tf.Graph().as_default():
    # Our TensorFlow code goes here
Here, the as_default() function simply says that all subsequent Variables
and operations should be added to the graph created with the tf.Graph() call.
Next, let's create a constant:
c = tf.constant(1.5)
When Python executes this line of code, it starts defining the blueprint for our graph:
Next, let's create a Variable:
x = tf.Variable(1.0, name="x")
Executing this line creates a slot for a Variable in the graph:
Actually, as you can see, this single line of code creates three things in the graph: a constant holding the initial value, an assign op, and the Variable itself. When Python executes the code x = tf.Variable(1.0, name="x"), the Variable
is declared, but has not been allocated or initialized. The Variable is also
assigned the name we passed in. As you can see in the figure, TensorFlow takes
the name and appends ":0" to it to get "x:0" as the final Tensor's name.
It can help to understand how objects are named in TensorFlow. Each operation is
given a unique name. In the case of the node for the Variable, we've created
an operation of type VariableOp that is named "x" that produces a Tensor. This
tensor is a writable reference type (tf.float32_ref) named "x:0".
Tensors are named after the operation that produces them and are sequentially
numbered. A VariableOp always has one output, so the Tensor is named "x:0". Some
operations, such as tf.split, produce multiple outputs. For example,
tf.split(input, 2, axis=0, name='x') would produce two output Tensors named
"x:0" and "x:1".
If an operation is assigned a name that already exists in the graph, then TensorFlow ensures uniqueness by appending an underscore and number. For example, declaring a second Variable named "x" would yield a name of "x_1:0".
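You can check these naming rules directly by printing the .name attribute of each tensor. A minimal sketch, assuming the TensorFlow 1.x API used throughout this notebook:

```python
import tensorflow as tf

with tf.Graph().as_default():
    x = tf.Variable(1.0, name="x")  # op "x" -> output tensor "x:0"
    y = tf.Variable(2.0, name="x")  # duplicate name -> op renamed "x_1"
    print(x.name)  # x:0
    print(y.name)  # x_1:0
```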
The Python variable x contains a reference to that node, but at this
point, you cannot access the value of the Variable in TensorFlow (because it
hasn't been instantiated, yet). The addition of the constant and assign op
lays the groundwork for initializing the Variable.
The reference returned by assign allows you to access the new value of the Variable. Note
that the assign op only executes when this returned reference (or the op
itself) is run. In this case, the assign op will be run for us automatically
when we run the tf.global_variables_initializer() op later.
Now let's specify our addition:
add_op = tf.add(x, c)
The add op is also added to the graph and connected to the previously defined
constant and Variable:
Finally, let's assign the result back to the Variable represented by x:
assign_op = tf.assign(x, add_op)
Our graph now looks like this:
If you inspect the graph above, you'll notice that our Variable is attached to
two different assign ops, but there is no sense of order of operations for the
two assign ops (e.g., which assign op will execute first, and
why?). (However, there is an order of operations specified for the other,
connected nodes -- for example, tf.add needs to run before the top-most
tf.assign op.) Clearly, we want to initialize the Variable before we try to
add a number and assign the result back to the Variable. We'll explicitly
order these operations below when we call Session.run().
On the topic of initialization, we add one last op to initialize all variables:
init = tf.global_variables_initializer()
This op will cause our variable to be allocated and initialized through the
assign() op automatically created for us (the one on the bottom of the
figure).
Our final graph looks like this -- note the addition of the
tf.global_variables_initializer() op:
But, remember, this is still just a blueprint! No computation has yet occurred
in the graph -- we still need to run the graph to initialize the Variable,
perform the addition, and assign the result back.
To run the graph, we create a Session:
with tf.Session() as sess:
    # ...
Creating a Session brings our blueprint to life:
We can now call our init op to initialize the Variable:
with tf.Session() as sess:
    sess.run(init)
This causes the following subgraph to come to life...
...and initialize the Variable:
As the figure suggests, only the subgraph connected to the
tf.global_variables_initializer() op executes.
We can now perform the addition and assignment. However, instead of calling each
operation individually, we can simply run the topmost assign op -- TensorFlow
will automatically determine the dependencies and execute them. In this case,
the assign op depends on the result of the add op, so it runs that first:
sess.run(assign_op)
Calculating the subgraph to run:
Performing the addition:
Performing the assignment back to our Variable:
At this point, we can now retrieve the new value of the Variable, if we'd
like:
print(sess.run(x)) # Should print out 2.5
The final code:
import tensorflow as tf

with tf.Graph().as_default() as g:
    x = tf.Variable(1.0, name="x")
    add_op = tf.add(x, tf.constant(1.5))
    assign_op = tf.assign(x, add_op)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        sess.run(assign_op)
        print(sess.run(x))
Normally, you won't need to worry about explicitly creating a Graph or
Session, as we've done here -- these objects will be created for you
automatically by higher-level APIs. And, in many cases, you won't
need to worry about defining really low-level operations like we illustrated
here. However, it's still useful to have a basic model of how to define
computation in TensorFlow, and how TensorFlow actually performs those
computations.
In the next section, you'll get practice in writing some basic TensorFlow code.
In this exercise, your goal is to calculate the Fibonacci sequence using tensors, where $fib(n) = fib(n-1) + fib(n-2)$, and $fib(0) = 0$ and $fib(1) = 1$.
Spend about 10 minutes on this exercise, then check the solution.
To calculate this sequence, use the following two tensors:

- fib_seq, a 2x1 2D tensor Variable that represents the latest two values
of the Fibonacci sequence (the nth and (n-1)th). Initialize fib_seq with
the following two values: $\begin{bmatrix}0.0\\1.0\end{bmatrix}$
- fib_matrix, a constant 2x2 2D tensor that generates the next entries in
the Fibonacci sequence: $\begin{bmatrix}0.0 & 1.0\\1.0 & 1.0\end{bmatrix}$

If you perform a matrix multiplication of fib_matrix and fib_seq, you get
the next value in the sequence (the nth and the (n+1)th): $\begin{bmatrix}1.0\\1.0\end{bmatrix}$.
Using matrix multiplication on fib_matrix and the previous result produces the
next value: $\begin{bmatrix}1.0\\2.0\end{bmatrix}$. And so on.
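You can verify this recurrence in plain Python before writing any TensorFlow: repeatedly multiplying the 2x2 matrix by the 2x1 vector steps through the sequence. A quick check, no TF required:

```python
def mat_vec(m, v):
    # Multiply a 2x2 matrix by a 2-element vector.
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

fib_matrix = [[0.0, 1.0],
              [1.0, 1.0]]
fib_seq = [0.0, 1.0]

for _ in range(5):
    fib_seq = mat_vec(fib_matrix, fib_seq)
    print(fib_seq)  # [1.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [5.0, 8.0]
```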
In the fibonacci_seq function (below), perform the following steps:

1. Create fib_seq, the 2x1 2D tensor Variable to hold the current values of
the Fibonacci sequence.
2. Multiply fib_matrix and fib_seq using tf.matmul(), and assign the result
back to fib_seq using tf.assign().

Make sure you add the correct tensors to the output_dict so that the
computations are actually performed.
import tensorflow as tf

with tf.Graph().as_default() as g:
    # Add code that will calculate and output the Fibonacci sequence
    # using TF. You will need to make use of tf.matmul() and
    # tf.assign() to perform the multiplications and assign the result
    # back to the variable fib_sequence.
    fib_matrix = tf.constant([[0.0, 1.0],
                              [1.0, 1.0]])

    ### SOLUTION START ###
    # Put your solution code here.

    # Step 1.
    # Change this line to initialize fib_sequence to a 2x1 TensorFlow
    # tensor *Variable* with the initial values of 0.0 and 1.0. Hint:
    # You'll need to make sure you specify a 2D tensor of shape 2x1,
    # not a 1D tensor. See fib_matrix above (a 2x2 2D tensor) to guide
    # you.
    fib_sequence = None

    # Step 2.
    # Change this line to multiply fib_matrix and fib_sequence using tf.matmul()
    next_fib = None

    # Step 3.
    # Change this line to assign the result back to fib_sequence using tf.assign()
    assign_op = None
    ### SOLUTION END ###

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        for step in range(10):
            sess.run(assign_op)
            print(sess.run(fib_sequence))