In [1]:
%pylab inline
pylab.style.use('ggplot')
import numpy as np
import pandas as pd
TensorFlow provides multiple APIs.
The lowest level API, TensorFlow Core, provides you with complete programming control.
The higher-level APIs are built on top of TensorFlow Core; they make repetitive tasks easier and more consistent across users. A high-level API like tf.contrib.learn helps you manage data sets, estimators, training, and inference.
A Tensor is an array with an arbitrary number of dimensions. The number of dimensions is the rank of the tensor.
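To make the rank concrete, here is a small illustrative sketch (my addition, not one of the notebook's cells; it assumes the import tensorflow as tf from the cell below):

# Illustrative sketch: constants of rank 0, 1 and 2.
scalar = tf.constant(3.0)                 # rank 0: a single number
vector = tf.constant([1.0, 2.0, 3.0])     # rank 1: a 1-D array
matrix = tf.constant([[1.0, 2.0],
                      [3.0, 4.0]])        # rank 2: a 2-D array
print(scalar.shape, vector.shape, matrix.shape)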
A TensorFlow program consists of two stages: first we build the computation graph, then we run the graph in a session.
In [2]:
import tensorflow as tf
In [3]:
one = tf.constant(1, dtype=np.float64)
two = tf.constant(2, dtype=np.float64)
In [4]:
print(one)
In [5]:
print(two)
In [6]:
three = one + two
In [7]:
print(three)
In [8]:
three.op
Out[8]:
TensorFlow has its own type system. In this system, Python variables like one, two, and three above are just symbolic handles to nodes in the compute graph. The symbols are bound to actual tensor values only when we execute the graph in a session.
In [9]:
with tf.Session() as sess:
    three_val = sess.run(three)
    print(three_val)
In [10]:
print(one)
print(two)
print(three)
Being able to add constants is great, but it would be so much better if we could use our compute graph to add any two numbers.
A variable maintains state in the graph across calls to run(). We add a variable to the graph by constructing an instance of the class Variable.
In [11]:
x = tf.Variable(10, dtype=np.float64)
y = tf.Variable(16, dtype=np.float64)
add_op = x + y
The Variable constructor requires an initial value. The shape of the initial value becomes the shape of the tensor this Variable points to. Variables generally have a fixed shape, but TensorFlow provides mechanisms to reshape variables.
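A minimal sketch of the shape rule (my addition, not a notebook cell):

# The initial value fixes the variable's shape.
w_demo = tf.Variable(np.zeros((3, 1)), dtype=np.float64)
print(w_demo.shape)   # (3, 1), taken from the initial value
# One escape hatch (an assumption on my part, not used in this notebook) is
# tf.assign(..., validate_shape=False), which lets the variable take on the
# shape of the assigned value at run time.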
All the variables in a TensorFlow compute graph must be explicitly initialized. Instead of initializing each variable individually, this is usually done in one go with a convenience operation, like so:
init_op = tf.global_variables_initializer()
In [12]:
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    # Running init_op must be done separately - cannot be bundled with add_op.
    sess.run(init_op)
    add_result = sess.run(add_op)
    print(add_result)
If we want to add two different numbers, we can override the variables' current values by passing a feed_dict to run():
In [13]:
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    # Running init_op must be done separately - cannot be bundled with add_op.
    sess.run(init_op)
    add_result = sess.run(add_op, feed_dict={x: 20.0, y: 30.0})
    print(add_result)
TensorFlow has another symbolic type called placeholder that does the same job equally well.
In [14]:
x = tf.placeholder(dtype=np.float64)
y = tf.placeholder(dtype=np.float64)
add_op = x + y
with tf.Session() as sess:
    add_result = sess.run(add_op, feed_dict={x: 50.0, y: 70.0})
    print(add_result)
Note that placeholders have no initial value; they must be supplied with values at run time through the feed_dict mechanism.
Why do we need both placeholder and Variable? In a Machine Learning context, variables are typically tensors that an ML program will 'learn', e.g. the regression coefficients in a regression model, or the weights of a neural network. They are often initialized with random samples from some probability distribution. Placeholders are typically used to represent components of the training data.
Another way to look at the difference between these two is from the perspective of the TensorFlow graph. We want to be able to serialize/de-serialize the state of the compute graph of a trained model and be able to use it on new input data. All the internal state of the model (again, the weights of an ANN, etc.) must be represented in terms of variables. Placeholders are not part of the serialization system.
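To make the distinction concrete, here is a small hedged sketch (my addition, not from the notebook): a Variable carries its state across run() calls, whereas a placeholder has to be fed on every call.

# A Variable accumulates state across run() calls; a placeholder is fed each time.
counter = tf.Variable(0.0, dtype=np.float64)
increment = tf.assign_add(counter, 1.0)        # updates the variable's state
step_size = tf.placeholder(dtype=np.float64)   # must be supplied via feed_dict
scaled = counter * step_size

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        sess.run(increment)                    # counter is now 3.0
    print(sess.run(scaled, feed_dict={step_size: 10.0}))  # 30.0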
In [15]:
x = tf.placeholder(shape=[10, 10], dtype=np.float64)
y = tf.placeholder(shape=[10, 10], dtype=np.float64)
matmul_op = tf.matmul(x, y)
x_mat = np.random.rand(10, 10)
y_mat = np.random.rand(10, 10)
with tf.Session() as sess:
    tf_matmul = sess.run(matmul_op, feed_dict={x: x_mat, y: y_mat})
In [16]:
# Compare the result with what we'll get from standard numpy matrix multiplication
np_matmul = np.dot(x_mat, y_mat)
print(np.allclose(tf_matmul, np_matmul))
Let's solve a more complicated problem with TensorFlow - Ordinary Least Square Regression. We can frame the OLS regression problem as an optimization problem:
Minimize $L = (\hat{y} - y)^T(\hat{y} - y)$, where $\hat{y} = w^TX + b$ is the estimate of y.
In our example, we'll make up some data for the predictor $X$ and the target $y$. In our TensorFlow program, we'll represent $X$ and $y$ with placeholders. We'll also need a variable for the vector of regression coefficients, $w$; by appending a column of ones to $X$ we fold the intercept $b$ into $w$. We'll initialize $w$ randomly and then use Gradient Descent to update it iteratively as:
$w_{k+1} = w_k - \eta \nabla L_k$
where $\eta$ is the learning rate and $\nabla L_k$ is the gradient of the loss function w.r.t. $w$, evaluated at $w = w_k$.
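For reference, the gradient for this loss has a simple closed form (TensorFlow derives it automatically, so we never have to write it down ourselves). With $\hat{y} = Xw$ and the mean-squared version of the loss used in the code below,

$\nabla L = \frac{2}{n} X^T (Xw - y)$

where $n$ is the number of observations.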
In [17]:
x1 = np.random.rand(50)
x2 = np.random.rand(50)
y_ = 2*x1 + 3*x2 + 5
In [18]:
X_data = np.column_stack([x1, x2, np.ones(50)])
y_data = np.atleast_2d(y_).T
In [19]:
X = tf.placeholder(shape=[50, 3], dtype=np.float64)
y = tf.placeholder(shape=[50, 1], dtype=np.float64)
w = tf.Variable(np.random.rand(3, 1), dtype=np.float64)
y_hat = tf.matmul(X, w)
loss_func = tf.reduce_mean(tf.squared_difference(y_hat, y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
training_problem = optimizer.minimize(loss_func)
with tf.Session() as session:
    init_op = tf.global_variables_initializer()
    session.run(init_op)
    for step in range(1, 501):
        feed_dict = {X: X_data, y: y_data}
        session.run(training_problem, feed_dict=feed_dict)
        if step % 50 == 0:
            current_w = np.squeeze(w.eval(session=session))
            print('Result after {} iterations: {}'.format(step, current_w))
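As a sanity check (a sketch of my own, reusing the X_data and y_data arrays defined above), the closed-form OLS solution from the normal equations should agree with the gradient-descent estimate:

# Closed-form OLS: w = (X^T X)^{-1} X^T y, solved without explicit inversion.
w_closed_form = np.linalg.solve(X_data.T @ X_data, X_data.T @ y_data)
print(np.squeeze(w_closed_form))  # should recover the generating coefficients [2, 3, 5]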
In [20]:
import os
# Same as before, but with saving w
X = tf.placeholder(shape=[50, 3], dtype=np.float64)
y = tf.placeholder(shape=[50, 1], dtype=np.float64)
w = tf.Variable(np.random.rand(3, 1), dtype=np.float64, name='w')
y_hat = tf.matmul(X, w)
loss_func = tf.reduce_mean(tf.squared_difference(y_hat, y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
training_problem = optimizer.minimize(loss_func)
with tf.Session() as session:
    saver = tf.train.Saver()
    init_op = tf.global_variables_initializer()
    session.run(init_op)
    for step in range(1, 501):
        feed_dict = {X: X_data, y: y_data}
        session.run(training_problem, feed_dict=feed_dict)
        if step % 50 == 0:
            current_w = np.squeeze(w.eval(session=session))
            print('Result after {} iterations: {}'.format(step, current_w))
    os.makedirs(r'C:\Temp\tf_demo', exist_ok=True)
    save_loc = saver.save(session, r'C:\Temp\tf_demo\ols.ckpt')
    print('Model Saved in {}'.format(save_loc))
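To confirm the checkpoint is usable, here is a minimal restore sketch (my addition; it assumes the graph above is still defined in the current process and uses the checkpoint path from the save step):

# Restore the saved variables into a fresh session - no init_op and no re-training needed.
with tf.Session() as session:
    saver = tf.train.Saver()
    saver.restore(session, r'C:\Temp\tf_demo\ols.ckpt')
    print(np.squeeze(w.eval(session=session)))  # the learned coefficients, read back from disk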