----- IMPORTANT ------
The code presented here assumes that you're running TensorFlow v1.3.0 or higher, this was not released yet so the easiet way to run this is update your TensorFlow version to TensorFlow's master.

To do that go here and then execute:
pip install --ignore-installed --upgrade <URL for the right binary for your machine>.

For example, considering a Linux CPU-only running python2:
pip install --upgrade https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-1.2.1-cp27-none-linux_x86_64.whl

Here is walk-through to help getting started with tensorflow

1) Simple Linear Regression with low-level TensorFlow
2) Simple Linear Regression with a canned estimator
3) Playing with real data: linear regressor and DNN
4) Building a custom estimator to classify handwritten digits (MNIST)

What's next?

Dependencies



In [ ]:

    
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import collections

# tensorflow
import tensorflow as tf
print('Expected TensorFlow version is v1.3.0 or higher')
print('Your TensorFlow version:', tf.__version__)

# data manipulation
import numpy as np
import pandas as pd

# visualization
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
matplotlib.rcParams['figure.figsize'] = [12,8]

1) Simple Linear Regression with low-level TensorFlow

Generating data

This function creates a noisy dataset that's roughly linear, according to the equation y = mx + b + noise.

Notice that the expected value for m is 0.1 and for b is 0.3. This is the values we expect the model to predict.



In [ ]:

    
def make_noisy_data(m=0.1, b=0.3, n=100):
    x = np.random.randn(n)
    noise = np.random.normal(scale=0.01, size=len(x))
    y = m * x + b + noise
    return x, y

Create training data



In [ ]:

    
x_train, y_train = make_noisy_data()

Plot the training data



In [ ]:

    
plt.plot(x_train, y_train, 'b.')

The Model



In [ ]:

    
# input and output
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
y_label = tf.placeholder(shape=[None], dtype=tf.float32, name='y_label')

# variables
W = tf.Variable(tf.random_normal([1], name="W")) # weight
b = tf.Variable(tf.random_normal([1], name="b")) # bias

# actual model
y = W * x + b

The Loss and Optimizer

Define a loss function (here, squared error) and an optimizer (here, gradient descent).



In [ ]:

    
loss = tf.reduce_mean(tf.square(y - y_label))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(loss)

The Training Loop and generating predictions



In [ ]:

    
init = tf.global_variables_initializer()

with tf.Session() as sess:
  sess.run(init) # initialize variables
  for i in range(100): # train for 100 steps
    sess.run(train, feed_dict={x: x_train, y_label:y_train})

  x_plot = np.linspace(-3, 3, 101) # return evenly spaced numbers over a specified interval
  # using the trained model to predict values for the training data
  y_plot = sess.run(y, feed_dict={x: x_plot})

  # saving final weight and bias
  final_W = sess.run(W)
  final_b = sess.run(b)

Visualizing predictions



In [ ]:

    
plt.scatter(x_train, y_train)
plt.plot(x_plot, y_plot, 'g')

What is the final weight and bias?



In [ ]:

    
print('W:', final_W, 'expected: 0.1')
print('b:', final_b, 'expected: 0.3')

2) Simple Linear Regression with a canned estimator

Input Pipeline



In [ ]:

    
x_dict = {'x': x_train}
train_input = tf.estimator.inputs.numpy_input_fn(x_dict, y_train,
                                                 shuffle=True,
                                                 num_epochs=None) # repeat forever

Describe input feature usage



In [ ]:

    
features = [tf.feature_column.numeric_column('x')] # because x is a real number

Build and train the model



In [ ]:

    
estimator = tf.estimator.LinearRegressor(features)
estimator.train(train_input, steps = 1000)

Generating and visualizing predictions



In [ ]:

    
x_test_dict = {'x': np.linspace(-5, 5, 11)}
data_source = tf.estimator.inputs.numpy_input_fn(x_test_dict, shuffle=False)

predictions = list(estimator.predict(data_source))
preds = [p['predictions'][0] for p in predictions]

for y in predictions:
    print(y['predictions'])



In [ ]:

    
plt.scatter(x_train, y_train)
plt.plot(x_test_dict['x'], preds, 'g')

3) Playing with real data: linear regressor and DNN

Get the data

The Adult dataset is from the Census bureau and the task is to predict whether a given adult makes more than $50,000 a year based attributes such as education, hours of work per week, etc.

But the code here presented can be easilly aplicable to any csv dataset that fits in memory.

More about the data here



In [ ]:

    
census_train_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data'
census_train_path = tf.contrib.keras.utils.get_file('census.train', census_train_url)

census_test_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test'
census_test_path = tf.contrib.keras.utils.get_file('census.test', census_test_url)

Load the data



In [ ]:

    
column_names = [
  'age', 'workclass', 'fnlwgt', 'education', 'education-num',
  'marital-status', 'occupation', 'relationship', 'race', 'sex',
  'capital-gain', 'capital-loss', 'hours-per-week', 'native-country',
  'income'
]

census_train = pd.read_csv(census_train_path, index_col=False, names=column_names) 
census_test = pd.read_csv(census_train_path, index_col=False, names=column_names) 

census_train_label = census_train.pop('income') == " >50K" 
census_test_label = census_test.pop('income') == " >50K"



In [ ]:

    
census_train.head(10)



In [ ]:

    
census_train_label[:20]

Input pipeline



In [ ]:

    
train_input = tf.estimator.inputs.pandas_input_fn(
    census_train, 
    census_train_label,
    shuffle=True, 
    batch_size = 32, # process 32 examples at a time
    num_epochs=None,
)



In [ ]:

    
test_input = tf.estimator.inputs.pandas_input_fn(
    census_test, 
    census_test_label, 
    shuffle=True, 
    num_epochs=1)



In [ ]:

    
features, labels = train_input()
features

Feature description



In [ ]:

    
features = [
    tf.feature_column.numeric_column('hours-per-week'),
    tf.feature_column.bucketized_column(tf.feature_column.numeric_column('education-num'), list(range(25))),
    tf.feature_column.categorical_column_with_vocabulary_list('sex', ['male','female']),
    tf.feature_column.categorical_column_with_hash_bucket('native-country', 1000),
]



In [ ]:

    
estimator = tf.estimator.LinearClassifier(features, model_dir='census/linear',n_classes=2)



In [ ]:

    
estimator.train(train_input, steps=5000)

Evaluate the model



In [ ]:

    
estimator.evaluate(test_input)

DNN model

Update input pre-processing



In [ ]:

    
features = [
    tf.feature_column.numeric_column('education-num'),
    tf.feature_column.numeric_column('hours-per-week'),
    tf.feature_column.numeric_column('age'),
    tf.feature_column.indicator_column(
        tf.feature_column.categorical_column_with_vocabulary_list('sex',['male','female'])),
    tf.feature_column.embedding_column(  # now using embedding!
        tf.feature_column.categorical_column_with_hash_bucket('native-country', 1000), 10)
]



In [ ]:

    
estimator = tf.estimator.DNNClassifier(hidden_units=[20,20], 
                                       feature_columns=features, 
                                       n_classes=2, 
                                       model_dir='census/dnn')



In [ ]:

    
estimator.train(train_input, steps=5000)



In [ ]:

    
estimator.evaluate(test_input)

Custom Input Pipeline using Datasets API

Read the data



In [ ]:

    
def census_input_fn(path):
    def input_fn():    
        dataset = (
            tf.contrib.data.TextLineDataset(path)
                .map(csv_decoder)
                .shuffle(buffer_size=100)
                .batch(32)
                .repeat())

        columns = dataset.make_one_shot_iterator().get_next()

        income = tf.equal(columns.pop('income')," >50K") 

        return columns, income
    
    return input_fn



In [ ]:

    
csv_defaults = collections.OrderedDict([
  ('age',[0]),
  ('workclass',['']),
  ('fnlwgt',[0]),
  ('education',['']),
  ('education-num',[0]),
  ('marital-status',['']),
  ('occupation',['']),
  ('relationship',['']),
  ('race',['']),
  ('sex',['']),
  ('capital-gain',[0]),
  ('capital-loss',[0]),
  ('hours-per-week',[0]),
  ('native-country',['']),
  ('income',['']),
])



In [ ]:

    
def csv_decoder(line):
  parsed = tf.decode_csv(line, csv_defaults.values())
  return dict(zip(csv_defaults.keys(), parsed))

Try the input function



In [ ]:

    
tf.reset_default_graph()
census_input = census_input_fn(census_train_path)
training_batch = census_input()



In [ ]:

    
with tf.Session() as sess:
    features, high_income = sess.run(training_batch)



In [ ]:

    
print(features['education'])



In [ ]:

    
print(features['age'])



In [ ]:

    
print(high_income)

4) Building a custom estimator to classify handwritten digits (MNIST)

Image from: http://rodrigob.github.io/are_we_there_yet/build/images/mnist.png?1363085077



In [ ]:

    
train,test = tf.contrib.keras.datasets.mnist.load_data()
x_train,y_train = train 
x_test,y_test = test

mnist_train_input = tf.estimator.inputs.numpy_input_fn({'x':np.array(x_train, dtype=np.float32)},
                                                       np.array(y_train,dtype=np.int32),
                                                       shuffle=True,
                                                       num_epochs=None)

mnist_test_input = tf.estimator.inputs.numpy_input_fn({'x':np.array(x_test, dtype=np.float32)},
                                                      np.array(y_test,dtype=np.int32),
                                                      shuffle=True,
                                                      num_epochs=1)

tf.estimator.LinearClassifier



In [ ]:

    
estimator = tf.estimator.LinearClassifier([tf.feature_column.numeric_column('x',shape=784)], 
                                          n_classes=10,
                                          model_dir="mnist/linear")
estimator.train(mnist_train_input, steps = 10000)



In [ ]:

    
estimator.evaluate(mnist_test_input)

Examine the results with TensorBoard

$> tensorboard --logdir mnnist/DNN



In [ ]:

    
estimator = tf.estimator.DNNClassifier(hidden_units=[256],
                                       feature_columns=[tf.feature_column.numeric_column('x',shape=784)], 
                                       n_classes=10,
                                       model_dir="mnist/DNN")
estimator.train(mnist_train_input, steps = 10000)



In [ ]:

    
estimator.evaluate(mnist_test_input)



In [ ]:

    
# Parameters
BATCH_SIZE = 128
STEPS = 10000

A Custom Model



In [ ]:

    
def build_cnn(input_layer, mode):
    with tf.name_scope("conv1"):  
      conv1 = tf.layers.conv2d(inputs=input_layer,filters=32, kernel_size=[5, 5],
                               padding='same', activation=tf.nn.relu)

    with tf.name_scope("pool1"):  
      pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    with tf.name_scope("conv2"):  
      conv2 = tf.layers.conv2d(inputs=pool1,filters=64, kernel_size=[5, 5],
                               padding='same', activation=tf.nn.relu)

    with tf.name_scope("pool2"):  
      pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    with tf.name_scope("dense"):  
      pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
      dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

    with tf.name_scope("dropout"):  
      is_training_mode = mode == tf.estimator.ModeKeys.TRAIN
      dropout = tf.layers.dropout(inputs=dense, rate=0.4, training=is_training_mode)

    logits = tf.layers.dense(inputs=dropout, units=10)

    return logits



In [ ]:

    
def model_fn(features, labels, mode):
  # Describing the model
  input_layer = tf.reshape(features['x'], [-1, 28, 28, 1])
    
  tf.summary.image('mnist_input',input_layer)
    
  logits = build_cnn(input_layer, mode)
 
  # Generate Predictions
  classes = tf.argmax(input=logits, axis=1)
  predictions = {
      'classes': classes,
      'probabilities': tf.nn.softmax(logits, name='softmax_tensor')
  }

  if mode == tf.estimator.ModeKeys.PREDICT:
    # Return an EstimatorSpec object
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

  with tf.name_scope('loss'):
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
  
  loss = tf.reduce_sum(loss)
  tf.summary.scalar('loss', loss)
    
  with tf.name_scope('accuracy'):
    accuracy = tf.cast(tf.equal(tf.cast(classes,tf.int32),labels),tf.float32)
  accuracy = tf.reduce_mean(accuracy)
  tf.summary.scalar('accuracy', accuracy)

  # Configure the Training Op (for TRAIN mode)
  if mode == tf.estimator.ModeKeys.TRAIN:
    train_op = tf.contrib.layers.optimize_loss(
        loss=loss,
        global_step=tf.train.get_global_step(),
        learning_rate=1e-4,
        optimizer='Adam')

    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions,
                                      loss=loss, train_op=train_op)

  # Configure the accuracy metric for evaluation
  eval_metric_ops = {
      'accuracy': tf.metrics.accuracy(
          classes,
          input=labels)
  }

  return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions,
                                    loss=loss, eval_metric_ops=eval_metric_ops)

Runs estimator



In [ ]:

    
# create estimator
run_config = tf.contrib.learn.RunConfig(model_dir='mnist/CNN')
estimator = tf.estimator.Estimator(model_fn=model_fn, config=run_config)

# train for 10000 steps
estimator.train(input_fn=mnist_train_input, steps=10000)

# evaluate
estimator.evaluate(input_fn=mnist_test_input)

# predict
preds = estimator.predict(input_fn=test_input_fn)

Distributed tensorflow: using experiments



In [ ]:

    
# Run an experiment
from tensorflow.contrib.learn.python.learn import learn_runner

# Enable TensorFlow logs
tf.logging.set_verbosity(tf.logging.INFO)



In [ ]:

    
# create experiment
def experiment_fn(run_config, hparams):
  # create estimator
  estimator = tf.estimator.Estimator(model_fn=model_fn,
                                     config=run_config)
  return tf.contrib.learn.Experiment(
      estimator,
      train_input_fn=train_input_fn,
      eval_input_fn=test_input_fn,
      train_steps=STEPS
  )

# run experiment
learn_runner.run(experiment_fn,
    run_config=run_config)

Examine the results with TensorBoard

$> tensorboard --logdir mnist/CNN