In [1]:
%pylab inline
pylab.style.use('ggplot')

import numpy as np
import pandas as pd


Populating the interactive namespace from numpy and matplotlib

The TensorFlow API

TensorFlow provides multiple APIs.

  • The lowest level API, TensorFlow Core, provides you with complete programming control.

  • The higher level APIs are built on top of TensorFlow Core; they make repetitive tasks easier and more consistent between different users. A high-level API like tf.contrib.learn helps you manage data sets, estimators, training and inference.

TensorFlow Components

Tensors

A tensor is a multi-dimensional array of values. The number of dimensions is the rank of the tensor: a scalar is a rank-0 tensor, a vector is rank 1, a matrix is rank 2, and so on.
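Below is a minimal sketch of tensors of different ranks (the names are illustrative; TensorFlow is imported here only so the snippet stands on its own - the notebook imports it properly in the next section):

import tensorflow as tf

scalar = tf.constant(3.0)                  # rank 0, shape ()
vector = tf.constant([1.0, 2.0, 3.0])      # rank 1, shape (3,)
matrix = tf.constant([[1.0, 2.0],
                      [3.0, 4.0]])         # rank 2, shape (2, 2)
print(scalar.get_shape(), vector.get_shape(), matrix.get_shape())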

The TensorFlow Compute Graph

  • A computational graph is a series of TensorFlow operations arranged into a graph of nodes.
  • Each operation takes tensors as input and produces tensors as output.
  • A TensorFlow program consists of two stages:

    • Define the compute graph
    • Execute the compute graph (or a subgraph of it)

First TensorFlow Program


In [2]:
import tensorflow as tf

In [3]:
one = tf.constant(1, dtype=np.float64)
two = tf.constant(2, dtype=np.float64)

In [4]:
print(one)


Tensor("Const:0", shape=(), dtype=float64)

In [5]:
print(two)


Tensor("Const_1:0", shape=(), dtype=float64)

In [6]:
three = one + two

In [7]:
print(three)


Tensor("add:0", shape=(), dtype=float64)

In [8]:
three.op


Out[8]:
<tf.Operation 'add' type=Add>

TensorFlow has its own type system. The Python variables one, two and three above are symbolic handles to nodes in the graph, not numeric values. The symbols are bound to actual values only when we execute the graph (or a subgraph of it) in a session; even after the graph has been run, as the cells below show, the Python variables still refer to the symbolic tensors.


In [9]:
with tf.Session() as sess:
    three_val = sess.run(three)
    print(three_val)


3.0

In [10]:
print(one)
print(two)
print(three)


Tensor("Const:0", shape=(), dtype=float64)
Tensor("Const_1:0", shape=(), dtype=float64)
Tensor("add:0", shape=(), dtype=float64)

Variables

Being able to add constants is great, but it would be so much better if we could use our compute graph to add any two numbers.

A variable maintains state in the graph across calls to run(). We add a variable to the graph by constructing an instance of the class Variable.


In [11]:
x = tf.Variable(10, dtype=np.float64)
y = tf.Variable(16, dtype=np.float64)
add_op = x + y

The Variable constructor requires an initial value. The shape of the initial value becomes the shape of the tensor this Variable points to. Variables generally have a fixed shape, but TensorFlow provides mechanisms to reshape variables.
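A minimal sketch of these two points (the variable v is illustrative; passing validate_shape=False to tf.assign is one such reshaping mechanism in TensorFlow 1.x):

v = tf.Variable(np.zeros((2, 3)), dtype=np.float64)
print(v.get_shape())                                  # (2, 3) - taken from the initial value

# Assigning a value of a different shape ordinarily fails; validate_shape=False allows it.
reshape_op = tf.assign(v, np.zeros((3, 2)), validate_shape=False)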

All the variables in a TensorFlow compute graph must be explicitly initialized. Rather than initializing each variable individually, this is usually done in one go with a convenience operation, like so:

init_op = tf.global_variables_initializer()

In [12]:
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    # Running init_op must be done separately - cannot be bundled with add_op.
    sess.run(init_op)
    add_result = sess.run(add_op)
    print(add_result)


26.0

If we want to add two different numbers, we can override the variables' current values through the feed_dict argument:


In [13]:
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    # Running init_op must be done separately - cannot be bundled with add_op.
    sess.run(init_op)
    add_result = sess.run(add_op, feed_dict={x: 20.0, y: 30.0})
    print(add_result)


50.0
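Variables also maintain their state across calls to run() within a single session, as mentioned above. A minimal sketch of this statefulness (the counter is purely illustrative):

counter = tf.Variable(0, dtype=np.int32)
increment_op = tf.assign_add(counter, 1)   # add 1 to the variable in place

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        print(sess.run(increment_op))      # prints 1, 2, 3 - the value persists between runs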

Placeholders

TensorFlow has another symbolic type, called a placeholder, that does the same job equally well.


In [14]:
x = tf.placeholder(dtype=np.float64)
y = tf.placeholder(dtype=np.float64)
add_op = x + y

with tf.Session() as sess:
    add_result = sess.run(add_op, feed_dict={x: 50.0, y: 70.0})
    print(add_result)


120.0

Note that placeholders

  • Do not require initial values
  • Do not require initialization ops
  • As a result, their values must be supplied at run time via the feed_dict mechanism (as the sketch below illustrates)
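A minimal sketch of that last point (the op and error handling are illustrative; in TensorFlow 1.x, evaluating an op that depends on an unfed placeholder raises an InvalidArgumentError):

p = tf.placeholder(dtype=np.float64)
double_op = 2.0 * p

with tf.Session() as sess:
    try:
        sess.run(double_op)                         # nothing fed for p
    except tf.errors.InvalidArgumentError as e:
        print('Failed as expected:', e.message)
    print(sess.run(double_op, feed_dict={p: 3.0}))  # 6.0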

Why do we need both placeholder and Variable? In a Machine Learning context, variables are typically tensors that an ML program will 'learn', e.g. the regression coefficients in a regression model, or the weights of a neural network. They are often initialized with random samples from some probability distribution. Placeholders are typically used to represent components of the training data.

Another way to look at the difference between the two is from the perspective of the TensorFlow graph. We want to be able to serialize/de-serialize the state of the compute graph of a trained model and use it on new input data. All the internal state of the model (again, the weights of an ANN, etc.) must be represented in terms of variables. Placeholders are not part of the serialization system.

Using Numpy Matrices

For most Machine Learning programs, we are more interested in operations on 1D/2D arrays than on scalars.


In [15]:
x = tf.placeholder(shape=[10, 10], dtype=np.float64)
y = tf.placeholder(shape=[10, 10], dtype=np.float64)

matmul_op = tf.matmul(x, y)

x_mat = np.random.rand(10, 10)
y_mat = np.random.rand(10, 10)

with tf.Session() as sess:
    tf_matmul = sess.run(matmul_op, feed_dict={x: x_mat, y: y_mat})

In [16]:
# Compare the result with what we'll get from standard numpy matrix multiplication

np_matmul = np.dot(x_mat, y_mat)
print(np.allclose(tf_matmul, np_matmul))


True

OLS Regression with TensorFlow

Let's solve a more complicated problem with TensorFlow - Ordinary Least Squares (OLS) regression. We can frame OLS regression as an optimization problem:

Minimize $L = \frac{1}{n}(\hat{y} - y)^T(\hat{y} - y)$ (the mean squared error), where $\hat{y} = Xw$ is the estimate of $y$ and the intercept is absorbed into $w$ by appending a column of ones to $X$.

In our example, we'll make up some noise-free data for the predictors $X$ and the target $y$, so the exact coefficients are recoverable. In our TensorFlow program, we'll represent $X$ and $y$ with placeholders. We'll also need a variable for the vector of regression coefficients, $w$. We'll initialize $w$ randomly and then use Gradient Descent to update it iteratively as:

$w_{k+1} = w_k - \eta \nabla L_k$

Where $\eta$ is the learning rate and $\nabla L_k$ is the gradient of the loss function w.r.t. $w$, evaluated at $w = w_k$.
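To make the update concrete, here is a minimal NumPy sketch of one such step under the mean-squared-error loss defined above (the function name is illustrative):

import numpy as np

def gradient_step(w, X, y, eta):
    # Gradient of L = mean((Xw - y)^2) with respect to w is (2/n) * X^T (Xw - y).
    grad = (2.0 / len(y)) * X.T.dot(X.dot(w) - y)
    return w - eta * grad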


In [17]:
x1 = np.random.rand(50)
x2 = np.random.rand(50)

y_ = 2*x1 + 3*x2 + 5

In [18]:
X_data = np.column_stack([x1, x2, np.ones(50)])
y_data = np.atleast_2d(y_).T

In [19]:
X = tf.placeholder(shape=[50, 3], dtype=np.float64)   # predictors (including the column of ones)
y = tf.placeholder(shape=[50, 1], dtype=np.float64)   # target

w = tf.Variable(np.random.rand(3, 1), dtype=np.float64)       # coefficients to be learned
y_hat = tf.matmul(X, w)                                        # predicted y
loss_func = tf.reduce_mean(tf.squared_difference(y_hat, y))   # mean squared error

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
training_problem = optimizer.minimize(loss_func)               # one gradient-descent step per run()

with tf.Session() as session:
    init_op = tf.global_variables_initializer()
    session.run(init_op)
    
    for step in range(1, 501):
        feed_dict = {X: X_data, y: y_data}
        session.run(training_problem, feed_dict=feed_dict)
        if step % 50 == 0:
            current_w = np.squeeze(w.eval(session=session))
            print('Result after {} iterations: {}'.format(step, current_w))


Result after 50 iterations: [ 2.01998758  3.00424542  4.98758518]
Result after 100 iterations: [ 2.00089698  3.00055844  4.99926148]
Result after 150 iterations: [ 2.00004779  3.00003861  4.99995629]
Result after 200 iterations: [ 2.00000273  3.00000238  4.99999742]
Result after 250 iterations: [ 2.00000016  3.00000014  4.99999985]
Result after 300 iterations: [ 2.00000001  3.00000001  4.99999999]
Result after 350 iterations: [ 2.  3.  5.]
Result after 400 iterations: [ 2.  3.  5.]
Result after 450 iterations: [ 2.  3.  5.]
Result after 500 iterations: [ 2.  3.  5.]
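Because this is ordinary least squares, we can sanity-check the learned coefficients against NumPy's least-squares solver (a verification sketch, reusing the X_data and y_data arrays defined above):

w_lstsq, _, _, _ = np.linalg.lstsq(X_data, y_data)
print(np.squeeze(w_lstsq))   # should be close to [2, 3, 5]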

Saving the Trained Model


In [20]:
import os
# Same as before, but with saving w

X = tf.placeholder(shape=[50, 3], dtype=np.float64)
y = tf.placeholder(shape=[50, 1], dtype=np.float64)

w = tf.Variable(np.random.rand(3, 1), dtype=np.float64, name='w')
y_hat = tf.matmul(X, w)
loss_func = tf.reduce_mean(tf.squared_difference(y_hat, y))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
training_problem = optimizer.minimize(loss_func)

with tf.Session() as session:
    saver = tf.train.Saver()
    init_op = tf.global_variables_initializer()
    session.run(init_op)
    
    for step in range(1, 501):
        feed_dict = {X: X_data, y: y_data}
        session.run(training_problem, feed_dict=feed_dict)
        if step % 50 == 0:
            current_w = np.squeeze(w.eval(session=session))
            print('Result after {} iterations: {}'.format(step, current_w))
    
    os.makedirs(r'C:\Temp\tf_demo', exist_ok=True)
    save_loc = saver.save(session, r'C:\Temp\tf_demo\ols.ckpt')
    print('Model Saved in {}'.format(save_loc))


Result after 50 iterations: [ 2.03282598  3.01199644  4.97713402]
Result after 100 iterations: [ 2.00157604  3.00110217  4.99864274]
Result after 150 iterations: [ 2.00008645  3.0000723   4.99991972]
Result after 200 iterations: [ 2.00000498  3.0000044   4.99999526]
Result after 250 iterations: [ 2.00000029  3.00000026  4.99999972]
Result after 300 iterations: [ 2.00000002  3.00000002  4.99999998]
Result after 350 iterations: [ 2.  3.  5.]
Result after 400 iterations: [ 2.  3.  5.]
Result after 450 iterations: [ 2.  3.  5.]
Result after 500 iterations: [ 2.  3.  5.]
Model Saved in C:\Temp\tf_demo\ols.ckpt

Using the Saved Model

See "loading_saved_variables.ipynb".