Low Level TensorFlow, Part III: Layers and Training



In [1]:

    
# import and check version
import tensorflow as tf
# tf can be really verbose
tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)









    



1.13.0-rc0



In [2]:

    
# a small sanity check, does tf seem to work ok? 
sess = tf.Session()
hello = tf.constant('Hello TF!')
print(sess.run(hello))
sess.close()









    



b'Hello TF!'

Transforming an input to a known output



In [0]:

    
input = [[-1], [0], [1], [2], [3], [4]]
output = [[2], [1], [0], [-1], [-2], [-3]]



In [10]:

    
import matplotlib.pyplot as plt

plt.xlabel('input')
plt.ylabel('output')

plt.plot(input, output, 'kX')









    Out[10]:





[<matplotlib.lines.Line2D at 0x7f5f4030e358>]

The relation between input and output is linear: output = 1 - input



In [13]:

    
plt.plot(input, output)
plt.plot(input, output, 'ro')









    Out[13]:





[<matplotlib.lines.Line2D at 0x7f5f40241da0>]



In [14]:

    
x = tf.constant(input, dtype=tf.float32)
y_true = tf.constant(output, dtype=tf.float32)
y_true









    Out[14]:





<tf.Tensor 'Const_2:0' shape=(6, 1) dtype=float32>

Defining the model to train

An untrained single unit (neuron) also produces a line from the same input, just not the one we want yet

The Artificial Neuron: Foundation of Deep Neural Networks (simplified, more later)

a neuron takes a number of numerical inputs,
multiplies each by a weight, sums up all weighted inputs, and
adds a bias (a constant) to that sum
from this it creates a single numerical output
for one input (one dimension) this describes a line
for more dimensions it describes a hyperplane that can serve as a decision boundary
this is typically expressed as a matrix multiplication plus an addition
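
The description above can be sketched in plain Python, without TensorFlow. The helper name `neuron` and the example weights are assumptions chosen for illustration; they are not part of the notebook's model. With a weight of -1 and a bias of 1, the sketch reproduces the target output used in this notebook.

```python
# A minimal sketch of a single artificial neuron in plain Python.
# The function name and the example values are illustrative assumptions.

def neuron(inputs, weights, bias):
    """Return the weighted sum of the inputs plus the bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# one input: the neuron describes a line, y = w * x + b
xs = [-1, 0, 1, 2, 3, 4]
line = [neuron([x], weights=[-1.0], bias=1.0) for x in xs]
print(line)

# two inputs: the same formula now describes a plane
plane_value = neuron([2.0, 3.0], weights=[0.5, -1.0], bias=0.25)
print(plane_value)
```

For a whole batch of inputs, the same computation is what `tf.matmul(x, w) + b` expresses in matrix form.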

From single neuron to network in the TensorFlow Playground

https://playground.tensorflow.org/#activation=linear&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.01&regularizationRate=0&noise=0&networkShape=1&seed=0.98437&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false



In [15]:

    
# short version, though harder to inspect
# y_pred = tf.layers.dense(inputs=x, units=1)

# matrix multiplication under the hood
# tf.matmul(x, w) + b
linear_model = tf.layers.Dense(units=1)
y_pred = linear_model(x)
y_pred









    Out[15]:





<tf.Tensor 'dense/BiasAdd:0' shape=(6, 1) dtype=float32>



In [16]:

    
# single neuron and single input: one weight and one bias
# weights and biases are represented as variables
# https://www.tensorflow.org/guide/variables
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  weights = sess.run(linear_model.trainable_weights)
  print(weights)









    



[array([[-0.30356097]], dtype=float32), array([0.], dtype=float32)]

Output of a single untrained neuron



In [19]:

    
# each time you execute this cell you should see a different line, as the variables are re-initialized randomly
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  output_pred = sess.run(y_pred)
  print(output_pred)
  weights = sess.run(linear_model.trainable_weights)
  print(weights)
  plt.plot(input, output_pred)
  plt.plot(input, output, 'ro')









    



[[-1.3577541]
 [ 0.       ]
 [ 1.3577541]
 [ 2.7155082]
 [ 4.073262 ]
 [ 5.4310164]]
[array([[1.3577541]], dtype=float32), array([0.], dtype=float32)]

Loss - Mean Squared Error

A loss function is a prerequisite for training: we need an objective to optimize. The loss measures the difference between the output we get and the output we would like to get.

Mean Squared Error

$MSE = {\frac {1}{n}}\sum _{i=1}^{n}(Y_{i}-{\hat {Y_{i}}})^{2}$

https://en.wikipedia.org/wiki/Mean_squared_error
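
As a rough illustration of the formula, the mean squared error can be computed by hand in plain Python. The function name and the prediction values are assumptions: the predictions are those of a hypothetical neuron with weight 1 and bias 0, not values taken from the TensorFlow graph.

```python
# A minimal sketch of the mean squared error, following the formula above.

def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    n = len(targets)
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n

targets = [2, 1, 0, -1, -2, -3]
# hypothetical predictions of a neuron with weight 1 and bias 0 (y = x)
predictions = [-1, 0, 1, 2, 3, 4]

mse = mean_squared_error(targets, predictions)
print(mse)

# a perfect prediction gives a loss of zero
print(mean_squared_error(targets, targets))
```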



In [21]:

    
loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)
loss









    Out[21]:





<tf.Tensor 'mean_squared_error/value:0' shape=() dtype=float32>



In [22]:

    
# when this loss is zero (which it is not right now) we get the desired output
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  print(sess.run(loss))

Minimize Loss by changing parameters of neuron

Move through parameter space in the direction of descent, i.e. against the gradient of the loss

https://twitter.com/colindcarroll/status/1090266016259534848

This is the job of the optimizer
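
What such an optimizer does for our single neuron can be sketched in plain Python. This is a hand-written gradient descent under stated assumptions, not TensorFlow's implementation: the starting point `w = 0, b = 0` is chosen for reproducibility (TensorFlow initializes the weight randomly), and the gradients are derived by hand from the mean squared error.

```python
# A minimal sketch of gradient descent for one weight and one bias.
xs = [-1, 0, 1, 2, 3, 4]
ys = [2, 1, 0, -1, -2, -3]

w, b = 0.0, 0.0        # assumed starting point, not TensorFlow's random init
learning_rate = 0.01
n = len(xs)
history = []           # loss before each update

for step in range(500):
    # prediction error for every sample
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    history.append(sum(e * e for e in errors) / n)

    # gradients of the mean squared error with respect to w and b
    grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2 / n * sum(errors)

    # step against the gradient, scaled by the learning rate
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)
print(history[0], history[-1])
```

The weight should end up close to -1 and the bias close to 1, with the loss shrinking along the way.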



In [23]:

    
# move the parameters of our single neuron in the right direction; the learning rate sets the step size
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(loss)
train









    Out[23]:





<tf.Operation 'GradientDescent' type=NoOp>



In [0]:

    
losses = []

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# every iteration uses the complete data set, so here one iteration equals one epoch
for i in range(500):
  
  # run one training step and fetch the current loss; only the loss value is kept
  _, loss_value = sess.run((train, loss))
  losses.append(loss_value)



In [25]:

    
print(sess.run(loss))









    



4.09069e-05

Learning Curve after training



In [29]:

    
# an idealized learning curve (typically you see a noisy curve that only roughly trends downward)

plt.yscale('log')
plt.ylabel("loss")
plt.xlabel("epochs")

plt.plot(losses)









    Out[29]:





[<matplotlib.lines.Line2D at 0x7f5f828f3e48>]

Line drawn by neuron after training

The result after training is not perfect, but the predicted line is almost identical to the target line
https://en.wikipedia.org/wiki/Linear_equation#Slope%E2%80%93intercept_form



In [30]:

    
output_pred = sess.run(y_pred)
print(output_pred)
plt.plot(input, output_pred)
plt.plot(input, output, 'ro')









    



[[ 1.9887948 ]
 [ 0.9915276 ]
 [-0.00573957]
 [-1.0030067 ]
 [-2.000274  ]
 [-2.9975412 ]]






    Out[30]:





[<matplotlib.lines.Line2D at 0x7f5f2c5ecba8>]



In [31]:

    
# single neuron and single input: one weight and one bias
# slope m ~ -1 (the learned weight)
# y-intercept y0 ~ 1 (the learned bias)
# https://en.wikipedia.org/wiki/Linear_equation#Slope%E2%80%93intercept_form

weights = sess.run(linear_model.trainable_weights)
print(weights)









    



[array([[-0.9972672]], dtype=float32), array([0.9915276], dtype=float32)]
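
As a quick cross-check, the learned weight and bias can be plugged into the slope-intercept form by hand. The sketch below copies the two values printed above; the variable names are assumptions for illustration.

```python
# Recompute the neuron's predictions from the learned parameters.
w = -0.9972672   # learned weight (slope), copied from the output above
b = 0.9915276    # learned bias (y-intercept), copied from the output above

xs = [-1, 0, 1, 2, 3, 4]
preds = [w * x + b for x in xs]
print(preds)
```

Up to float32 rounding, these values match the predictions printed after training.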


