In [1]:
# import and check version
import tensorflow as tf
# tf can be really verbose
tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)


1.13.0-rc0

In [2]:
# a small sanity check, does tf seem to work ok? 
sess = tf.Session()
hello = tf.constant('Hello TF!')
print(sess.run(hello))
sess.close()


b'Hello TF!'

Transforming an input to a known output


In [0]:
input = [[-1], [0], [1], [2], [3], [4]]
output = [[2], [1], [0], [-1], [-2], [-3]]

In [10]:
import matplotlib.pyplot as plt

plt.xlabel('input')
plt.ylabel('output')

plt.plot(input, output, 'kX')


Out[10]:
[<matplotlib.lines.Line2D at 0x7f5f4030e358>]

The relation between input and output is linear


In [13]:
plt.plot(input, output)
plt.plot(input, output, 'ro')


Out[13]:
[<matplotlib.lines.Line2D at 0x7f5f40241da0>]
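That the relation is linear can also be checked numerically, without a plot. The following is a minimal sketch in plain Python; the names `inputs`, `outputs`, `slopes` and `intercept` are chosen here for illustration and do not appear elsewhere in the notebook. If every pair of neighboring points has the same slope, all points lie on one line.

```python
# same data as above, flattened to plain lists
inputs = [-1, 0, 1, 2, 3, 4]
outputs = [2, 1, 0, -1, -2, -3]

points = list(zip(inputs, outputs))

# slope between each pair of neighboring points
slopes = [(y1 - y0) / (x1 - x0)
          for (x0, y0), (x1, y1) in zip(points, points[1:])]

# if all slopes agree, the y-axis offset follows from any single point
intercept = outputs[0] - slopes[0] * inputs[0]

print(slopes)
print(intercept)
```

All slopes come out as -1 and the offset as 1, so the data follows output = 1 - input.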

In [14]:
x = tf.constant(input, dtype=tf.float32)
y_true = tf.constant(output, dtype=tf.float32)
y_true


Out[14]:
<tf.Tensor 'Const_2:0' shape=(6, 1) dtype=float32>

Defining the model to train

An untrained single unit (neuron) also maps the same input to a line, just a different one

The Artificial Neuron: Foundation of Deep Neural Networks (simplified, more later)

  • a neuron takes a number of numerical inputs
  • multiplies each with a weight, sums up all weighted inputs, and
  • adds a bias (a constant) to that sum
  • from this it creates a single numerical output
  • for one input (one dimension) this describes a line
  • for more dimensions this describes a hyperplane that can serve as a decision boundary
  • this is typically expressed as a matrix multiplication plus an addition
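
The steps in the list above can be sketched in a few lines of plain Python. This is only an illustration: the function name `neuron` and the example weights and bias are made up here and are not values from this notebook.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of all inputs plus a bias, giving one numerical output."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# one input: the neuron describes the line y = 0.5 * x + 2
line_outputs = [neuron([x], [0.5], 2.0) for x in [-1, 0, 1, 2]]
print(line_outputs)

# two inputs: the same formula now describes a plane
plane_output = neuron([1.0, 3.0], [0.5, -1.0], 2.0)
print(plane_output)
```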

From single neuron to network in the TensorFlow Playground

https://playground.tensorflow.org/#activation=linear&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.01&regularizationRate=0&noise=0&networkShape=1&seed=0.98437&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false


In [15]:
# short version, though harder to inspect
# y_pred = tf.layers.dense(inputs=x, units=1)

# matrix multiplication under the hood
# tf.matmul(x, w) + b
linear_model = tf.layers.Dense(units=1)
y_pred = linear_model(x)
y_pred


Out[15]:
<tf.Tensor 'dense/BiasAdd:0' shape=(6, 1) dtype=float32>
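
The comment "matrix multiplication under the hood" can be made concrete with a small sketch in plain Python. It assumes the layer computes `matmul(x, w) + b` as the comment in the cell says; the helper `matmul` and the values of `w` and `b` are hypothetical and only serve to show the shapes involved: (6, 1) times (1, 1) gives (6, 1), matching the tensor shape printed above.

```python
def matmul(a, b):
    """Naive matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

x = [[-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]]  # shape (6, 1)
w = [[-0.5]]                                     # shape (1, 1), made-up weight
b = [0.25]                                       # one bias per unit, made-up value

# add the bias to every row of the product
y = [[value + b[j] for j, value in enumerate(row)] for row in matmul(x, w)]
print(y)
```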

In [16]:
# single neuron and single input: one weight and one bias
# weights and biases are represented as variables
# https://www.tensorflow.org/guide/variables
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  weights = sess.run(linear_model.trainable_weights)
  print(weights)


[array([[-0.30356097]], dtype=float32), array([0.], dtype=float32)]

Output of a single untrained neuron


In [19]:
# when you execute this cell, you should see a different line, as the initialization is random
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  output_pred = sess.run(y_pred)
  print(output_pred)
  weights = sess.run(linear_model.trainable_weights)
  print(weights)
  plt.plot(input, output_pred)
  plt.plot(input, output, 'ro')


[[-1.3577541]
 [ 0.       ]
 [ 1.3577541]
 [ 2.7155082]
 [ 4.073262 ]
 [ 5.4310164]]
[array([[1.3577541]], dtype=float32), array([0.], dtype=float32)]

Loss - Mean Squared Error

A loss function is a prerequisite for training: we need an objective to optimize. It measures the difference between the output we get and the output we would like to get.

Mean Squared Error

$MSE = {\frac {1}{n}}\sum _{i=1}^{n}(Y_{i}-{\hat {Y_{i}}})^{2}$

https://en.wikipedia.org/wiki/Mean_squared_error
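
To make the formula tangible, here is a minimal sketch of it in plain Python. The function name `mse` and the shifted predictions are chosen for illustration only; this is not the TensorFlow implementation.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n

targets = [2, 1, 0, -1, -2, -3]

# perfect predictions give a loss of zero
perfect = mse(targets, targets)

# predictions that are off by exactly one everywhere give a loss of one
off_by_one = mse(targets, [t + 1 for t in targets])

print(perfect, off_by_one)
```

Because the differences are squared, large errors are penalized much more than small ones.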


In [21]:
loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)
loss


Out[21]:
<tf.Tensor 'mean_squared_error/value:0' shape=() dtype=float32>

In [22]:
# when this loss is zero (which it is not right now) we get the desired output
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  print(sess.run(loss))


0.69367313

Minimize Loss by changing parameters of neuron

Move through parameter space in the direction in which the loss decreases

https://twitter.com/colindcarroll/status/1090266016259534848

Job of the optimizer


In [23]:
# move the parameters of our single neuron in the direction that lowers the loss; the learning rate sets the step size
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(loss)
train


Out[23]:
<tf.Operation 'GradientDescent' type=NoOp>
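
What one step of plain gradient descent does can be sketched without TensorFlow. The following assumes the model y = w * x + b and the mean squared error as loss; the start values for `w` and `b` are arbitrary and the gradients are written out by hand, whereas TensorFlow derives them automatically.

```python
xs = [-1, 0, 1, 2, 3, 4]
ys = [2, 1, 0, -1, -2, -3]
n = len(xs)

w, b = 0.5, 0.0        # arbitrary start values (assumption)
learning_rate = 0.01   # same value as passed to the optimizer above

def mse(w, b):
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / n

history = []
for step in range(500):
    # partial derivatives of the MSE with respect to w and b
    grad_w = sum(-2 * x * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
    grad_b = sum(-2 * (y - (w * x + b)) for x, y in zip(xs, ys)) / n

    # step against the gradient, scaled by the learning rate
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    history.append(mse(w, b))

print(w, b, history[-1])
```

With these settings the weight should end up close to -1 and the bias close to 1, the line that generated the data.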

In [0]:
losses = []

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# each iteration uses the full data set, so here one iteration equals one epoch
for i in range(500):
  
  # run the training op and evaluate the loss in one call; only the loss value is kept
  _, loss_value = sess.run((train, loss))
  losses.append(loss_value)

In [25]:
print(sess.run(loss))


4.09069e-05

Learning Curve after training


In [29]:
# an unusually clean learning curve (typically you see a noisy curve that only roughly goes down)

plt.yscale('log')
plt.ylabel("loss")
plt.xlabel("epochs")

plt.plot(losses)


Out[29]:
[<matplotlib.lines.Line2D at 0x7f5f828f3e48>]

Line drawn by neuron after training


In [30]:
output_pred = sess.run(y_pred)
print(output_pred)
plt.plot(input, output_pred)
plt.plot(input, output, 'ro')


[[ 1.9887948 ]
 [ 0.9915276 ]
 [-0.00573957]
 [-1.0030067 ]
 [-2.000274  ]
 [-2.9975412 ]]
Out[30]:
[<matplotlib.lines.Line2D at 0x7f5f2c5ecba8>]

In [31]:
# single neuron and single input: one weight and one bias
# slope m ~ -1
# y-axis offset y0 ~ 1
# https://en.wikipedia.org/wiki/Linear_equation#Slope%E2%80%93intercept_form

weights = sess.run(linear_model.trainable_weights)
print(weights)


[array([[-0.9972672]], dtype=float32), array([0.9915276], dtype=float32)]
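
As a quick cross-check, the trained weight and bias printed above should reproduce the predictions printed after training. This sketch copies both sets of numbers from the cell outputs and recomputes w * x + b in plain Python; the variable names are chosen for illustration.

```python
w, b = -0.9972672, 0.9915276   # trained weight and bias as printed above
inputs = [-1, 0, 1, 2, 3, 4]

# predictions as printed by the trained model
printed = [1.9887948, 0.9915276, -0.00573957,
           -1.0030067, -2.000274, -2.9975412]

reconstructed = [w * x + b for x in inputs]
max_diff = max(abs(r - p) for r, p in zip(reconstructed, printed))
print(max_diff)
```

The difference stays within float32 rounding, which confirms that the neuron is nothing more than the line with slope `w` and offset `b`.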

In [0]: