Barebones example of DNNRegressor in TensorFlow

In this notebook a DNNRegressor is trained through TensorFlow's tf.contrib.learn library. The example shows how to generate the feature_columns and how to feed the input using the input_fn argument.


In [1]:
# Used to clear up the workspace.
%reset -f
import numpy as np
import pickle
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.estimators import estimator
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the data.
data = pickle.load(open('../data/data-ant.pkl', 'rb'))
observations = data['observations']
actions = data['actions']
# We only keep the first label column, since this DNNRegressor setup handles a single regression target.
actions = actions[:, 0]

# Split the data.
X_train, X_test, y_train, y_test = train_test_split(observations, actions, test_size=10, random_state=42)

num_train = X_train.shape[0]
num_test = X_test.shape[0]
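Note that passing an integer test_size to train_test_split requests an absolute number of held-out samples (here 10), not a fraction. A quick check with synthetic stand-in data (the array contents are illustrative, not the pickled observations/actions):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for observations and actions.
X = np.random.rand(50, 4)
y = np.random.rand(50)

# An integer test_size is an absolute sample count, not a fraction.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=10, random_state=42)
print(X_tr.shape[0], X_te.shape[0])  # 40 10
```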

The pred_fn and input_fn functions take NumPy arrays as input and generate the feature columns and labels. The feature columns take the form of a dictionary with column names as keys and tf.constants of the corresponding columns as values, while the label is simply a tf.constant of the labels.

np.newaxis is added to address TensorFlow's warning that each input column should be a two-dimensional instead of a one-dimensional tensor.


In [2]:
def pred_fn(X):
    return {"my_col" + str(k): tf.constant(X[:, k][:, np.newaxis]) for k in range(X.shape[1])}

def input_fn(X, y):
    feature_cols = pred_fn(X)
    label = tf.constant(y)
    
    return feature_cols, label
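To see what pred_fn builds without starting a TF session, here is a NumPy-only sketch of the same comprehension, with tf.constant swapped for the raw column array (the toy matrix is illustrative):

```python
import numpy as np

X = np.arange(6.0).reshape(3, 2)  # 3 rows, 2 feature columns

# Same comprehension as pred_fn, but keeping the raw NumPy column so the
# keys and shapes can be inspected directly.
cols = {"my_col" + str(k): X[:, k][:, np.newaxis] for k in range(X.shape[1])}

print(sorted(cols.keys()))    # ['my_col0', 'my_col1']
print(cols["my_col0"].shape)  # (3, 1) -- two-dimensional, as TF expects
```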

In [3]:
feature_cols = [tf.contrib.layers.real_valued_column("my_col" + str(i)) for i in range(X_train.shape[1])]
# This did not work here, likely because the inferred column names do not match the ones produced by pred_fn.
#feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)

In [4]:
regressor = tf.contrib.learn.DNNRegressor(feature_columns=feature_cols, hidden_units=[100, 100])

regressor.fit(input_fn=lambda: input_fn(X_train, y_train), steps=1000);


WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpidHt5x
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_tf_random_seed': None, '_task_type': None, '_environment': 'local', '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f9f37fe8350>, '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1.0
}
, '_task_id': 0, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_evaluation_master': '', '_keep_checkpoint_every_n_hours': 10000, '_master': ''}
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/head.py:1362: scalar_summary (from tensorflow.python.ops.logging_ops) is deprecated and will be removed after 2016-11-30.
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpidHt5x/model.ckpt.
INFO:tensorflow:loss = 0.143963, step = 1
INFO:tensorflow:global_step/sec: 1.11048
INFO:tensorflow:loss = 0.0129359, step = 101
INFO:tensorflow:global_step/sec: 1.10946
INFO:tensorflow:loss = 0.00994792, step = 201
INFO:tensorflow:global_step/sec: 1.48373
INFO:tensorflow:loss = 0.0086202, step = 301
INFO:tensorflow:global_step/sec: 1.41203
INFO:tensorflow:loss = 0.0077492, step = 401
INFO:tensorflow:global_step/sec: 1.48551
INFO:tensorflow:loss = 0.00711447, step = 501
INFO:tensorflow:global_step/sec: 1.48836
INFO:tensorflow:loss = 0.00663014, step = 601
INFO:tensorflow:global_step/sec: 1.47308
INFO:tensorflow:loss = 0.00623549, step = 701
INFO:tensorflow:global_step/sec: 1.46015
INFO:tensorflow:loss = 0.00590518, step = 801
INFO:tensorflow:Saving checkpoints for 817 into /tmp/tmpidHt5x/model.ckpt.
INFO:tensorflow:global_step/sec: 1.42684
INFO:tensorflow:loss = 0.0056193, step = 901
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmpidHt5x/model.ckpt.
INFO:tensorflow:Loss for final step: 0.00537238.

In [5]:
pred = list(regressor.predict_scores(input_fn=lambda: pred_fn(X_test)))

print(pred)
print(y_test)
print(mean_squared_error(pred, y_test))


[0.39723414, -0.027354294, -0.061233871, -0.017296148, -0.37245646, 0.1132348, 0.1976911, -0.1596929, 0.38804257, 0.0017217866]
[ 0.50300872  0.04458803 -0.07244712  0.00861396 -0.49456769 -0.03319729
  0.18001977 -0.25375277  0.25746021 -0.05760179]
0.00832451
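As a sanity check on the reported error, sklearn's mean_squared_error is just the mean of the squared differences; the two agree on any pair of arrays (the values below are illustrative stand-ins, not the predictions above):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Illustrative stand-ins for pred and y_test.
pred = np.array([0.4, -0.03, 0.2])
y_true = np.array([0.5, 0.04, 0.18])

# Manual MSE: mean of squared element-wise differences.
mse = np.mean((pred - y_true) ** 2)
print(np.isclose(mse, mean_squared_error(y_true, pred)))  # True
```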