Learning Objectives
tf.Session()
tf.placeholder() and feed_dict
tf.train module for gradient descent
Eager Execution
Eager mode evaluates operations immediately and returns concrete values, rather than building a graph to run later. To enable eager mode, simply place tf.enable_eager_execution() at the top of your code. We recommend using eager execution when prototyping, as it is intuitive, easier to debug, and requires less boilerplate code.
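Below is a minimal sketch of eager mode, assuming TensorFlow 1.x with eager support. Because eager execution must be enabled before any graph operations are created, it would need to run in a fresh Python session rather than alongside the graph-mode cells in this notebook.
In [ ]:
# Hedged sketch: eager execution in TensorFlow 1.x (run in a fresh session).
import tensorflow as tf

tf.enable_eager_execution()  # must be called before any ops are created

a = tf.constant(value = [5, 3, 8], dtype = tf.int32)
b = tf.constant(value = [3, -1, 2], dtype = tf.int32)
c = tf.add(x = a, y = b)
print(c)          # a concrete tensor, e.g. tf.Tensor([ 8  2 10], shape=(3,), dtype=int32)
print(c.numpy())  # [ 8  2 10]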
Graph Execution
Graph mode is TensorFlow's default execution mode (although eager will become the default in TensorFlow 2.0). In graph mode, operations only produce a symbolic graph, which doesn't get executed until it is run within the context of a tf.Session(). This style of coding is less intuitive and requires more boilerplate; however, it can enable performance optimizations and is particularly suited to distributing training across multiple devices. We recommend using graph (delayed) execution for performance-sensitive production code.
In [ ]:
import tensorflow as tf
print(tf.__version__)
In [ ]:
a = tf.constant(value = [5, 3, 8], dtype = tf.int32)
b = tf.constant(value = [3, -1, 2], dtype = tf.int32)
c = tf.add(x = a, y = b)
print(c)
A graph can be executed in the context of a tf.Session(). Think of a session as the bridge between the front-end Python API and the back-end C++ execution engine. Within a session, passing a tensor operation to run() will cause TensorFlow to execute all upstream operations in the graph required to calculate that value.
In [ ]:
with tf.Session() as sess:
    result = sess.run(fetches = c)
    print(result)
Can you mix eager and graph execution together?
What if the values of a and b keep changing? How can you parameterize them so they can be fed in at runtime?
Step 1: Define Placeholders
Define a and b using tf.placeholder(). You'll need to specify the data type of the placeholder, and optionally a tensor shape.
Step 2: Provide feed_dict
Now when invoking run() within the tf.Session(), in addition to providing a tensor operation to evaluate, you also provide a dictionary whose keys are the placeholder tensors and whose values are the data to feed in at runtime.
In [ ]:
a = tf.placeholder(dtype = tf.int32, shape = [None])
b = tf.placeholder(dtype = tf.int32, shape = [None])
c = tf.add(x = a, y = b)

with tf.Session() as sess:
    result = sess.run(fetches = c, feed_dict = {
        a: [3, 4, 5],
        b: [-1, 2, 3]
    })
    print(result)
In [ ]:
X = tf.constant(value = [1,2,3,4,5,6,7,8,9,10], dtype = tf.float32)
Y = 2 * X + 10
print("X:{}".format(X))
print("Y:{}".format(Y))
Using mean squared error, our loss function is: \begin{equation} MSE = \frac{1}{m}\sum_{i=1}^{m}(\hat{Y}_i-Y_i)^2 \end{equation}
$\hat{Y}$ represents the vector containing our model's predictions: \begin{equation} \hat{Y} = w_0X + w_1 \end{equation}
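For instance, if $\hat{Y} = (1, 2)$ and $Y = (2, 4)$, then \begin{equation} MSE = \frac{1}{2}\left((1-2)^2 + (2-4)^2\right) = \frac{1+4}{2} = 2.5 \end{equation} Since the data above were generated by $Y = 2X + 10$, a perfect fit corresponds to $w_0 = 2$, $w_1 = 10$ and an MSE of zero.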
Note below we introduce TF variables for the first time. Unlike constants, variables are mutable.
Browse the official TensorFlow guide on variables for more information on when/how to use them.
Because the parameters $w_0$ and $w_1$ will be updated through gradient descent, we need to tell TensorFlow that these values are variables and initialize them accordingly. Have a look at the TensorFlow usage for variables here. Complete the code below to define and initialize the variables w0 and w1.
In [ ]:
with tf.variable_scope(name_or_scope = "training", reuse = tf.AUTO_REUSE):
    w0 = # TODO: Your code goes here
    w1 = # TODO: Your code goes here

Y_hat = w0 * X + w1
loss_mse = tf.reduce_mean(input_tensor = (Y_hat - Y)**2)
An optimizer in TensorFlow both calculates gradients and updates weights. In addition to basic gradient descent, TF provides implementations of several more advanced optimizers such as Adam and FTRL. They can all be found in the tf.train module.
Note below we're not explicitly telling the optimizer which tensors are our weight tensors. So how does it know what to update? Optimizers update all variables in the tf.GraphKeys.TRAINABLE_VARIABLES collection. All variables are added to this collection by default. Since our only variables are w0 and w1, this is the behavior we want. If we had a variable that we didn't want added to the collection, we would set trainable=False when creating it, as in the sketch below.
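As a hedged illustration of that behavior (the variable name step_counter is hypothetical and used only for this example), a variable created with trainable=False stays out of the collection that optimizers update:
In [ ]:
# Illustrative sketch: only variables with trainable=True (the default) end up
# in tf.GraphKeys.TRAINABLE_VARIABLES, which is the set an optimizer updates.
step_counter = tf.get_variable(name = "step_counter", initializer = tf.constant(0),
                               trainable = False)
print([v.name for v in tf.trainable_variables()])  # includes w0 and w1, but not step_counter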
When performing gradient descent, we must specify the learning rate and which optimizer to use. In the training loop we will create below, we'll pass the learning rate to the optimizer using a feed dictionary, so we need to create a placeholder for the learning rate. You can read more about placeholders in TensorFlow here. Placeholders are used for values that will be fed to the operation later. Complete the code below to create a placeholder for the learning rate.
We also want to specify the optimizer for the training loop we'll perform below. There are TensorFlow implementations of various optimizers. Complete the code below to create an optimizer. You can find the available optimizers in the tf.train module.
In [ ]:
LEARNING_RATE = # TODO: Your code goes here
optimizer = # TODO: Your code goes here
Finally, we are ready to execute our training loop in graph mode. As before, we need to calculate the gradients and update the weights via our optimizer. Complete the code below to call the optimizer using sess.run. You can read more about using tf.Session() to execute operations here. Note that you will also need to pass a feed_dict to specify the learning rate of the optimizer you created above.
After completing this exercise, compare with the training loop we made for Eager mode in the previous lab.
In [ ]:
STEPS = 1000

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize variables

    for step in range(STEPS):
        # 1. Calculate gradients and update weights
        # TODO: Your code goes here

        # 2. Periodically print MSE
        if step % 100 == 0:
            print("STEP: {} MSE: {}".format(step, sess.run(fetches = loss_mse)))

    # Print final MSE and weights
    print("STEP: {} MSE: {}".format(STEPS, sess.run(loss_mse)))
    print("w0:{}".format(round(float(sess.run(w0)), 4)))
    print("w1:{}".format(round(float(sess.run(w1)), 4)))
Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License