Theory Interlude - Tensor and Flow

First of all: Congratulations! You have made it pretty far and built some advanced machine learning systems already. Before we continue, it is time for a little theory interlude. In this chapter, we will peek under the hood of our systems and see what is going on there. This knowledge will directly help us build even better systems in the next chapters.

TensorFlow

You might have noticed something peculiar about Keras already. Each time we import it, it prints a notification about TensorFlow:


In [1]:
import keras


Using TensorFlow backend.

Keras is a high-level library that can be used as a simplified interface to TensorFlow. That means Keras does not do any computations by itself; it is just a convenient way to interact with TensorFlow, which runs in the background.
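You can verify this yourself: the standalone Keras package exposes a backend module that reports which engine is doing the actual work. A quick sketch:

from keras import backend as K

# Ask Keras which computation engine is doing the actual work.
print(K.backend())  # prints 'tensorflow' when TensorFlow is the backend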

TensorFlow is a software library developed by Google that is very popular for deep learning. In this material we usually work with TensorFlow only through Keras, since that is easier than working with TensorFlow directly. However, sometimes we might want to write a bit of TensorFlow code ourselves to build more advanced models.

The goal of TensorFlow is to run the computations needed for deep learning as fast as possible. It does so, as the name gives away, by working with tensors in a data flow graph.

What is a Tensor?

Tensors are arrays of numbers that transform according to specific rules. The simplest kind of tensor is a single number, also called a scalar. Scalars are sometimes referred to as rank zero tensors. The next kind of tensor is a vector, also called a rank one tensor. Next higher up the order are matrices, called rank two tensors, then cube matrices, called rank three tensors, and so on.

Rank  Name                  Expresses
0     Scalar                Magnitude
1     Vector                Magnitude & Direction
2     Matrix                Table of numbers
3     Cube Matrix           Cube of numbers
n     n-dimensional matrix  You get the idea
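To make these ranks concrete, here is a small numpy sketch (numpy arrays are how we represented tensors in week 1; the variable names and values are just illustrative):

import numpy as np

scalar = np.array(5)                  # rank 0 tensor: a single number
vector = np.array([1, 2, 3])          # rank 1 tensor
matrix = np.array([[1, 2], [3, 4]])   # rank 2 tensor
cube = np.ones((2, 2, 2))             # rank 3 tensor

# ndim reports the rank, shape the size along each dimension
for tensor in (scalar, vector, matrix, cube):
    print(tensor.ndim, tensor.shape)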

Think back to the first week and how we constructed a neural network from scratch. Take a second to think about the numpy matrices and vectors we used. If you cannot remember the exact setup, go back to the early chapters and look up the graphics showing our simple neural net.

Question: What kind of tensors were used in the neural net built from scratch in week 1?

Take a minute to answer the question without looking at the graphic below.

As you can see, all matrices and vectors in our neural network are tensors.

What about the Flow?

TensorFlow (and every other deep learning library) performs calculations along a 'computational graph'. In a computational graph, operations, such as a matrix multiplication or an activation function, are nodes in a network. Tensors get passed along the edges of the graph between the different operations. A forward pass through our simple neural net forms exactly such a graph.
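To make this concrete, here is a minimal sketch of a single dense layer's forward pass built as a graph with the low-level TensorFlow 1.x API (the shapes and names here are just illustrative, not the exact setup from week 1):

import numpy as np
import tensorflow as tf

# Each operation below adds a node to TensorFlow's computational graph;
# tensors flow along the edges between the nodes.
x = tf.placeholder(tf.float32, shape=(None, 3), name='input')
w = tf.Variable(tf.random_normal((3, 2)), name='weights')
b = tf.Variable(tf.zeros((2,)), name='bias')
z = tf.matmul(x, w) + b               # a matmul node followed by a bias-add node
a = tf.nn.relu(z, name='activation')  # a relu node

# Nothing has been computed yet; a session executes the graph.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a, feed_dict={x: np.ones((1, 3))}))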

The advantage of structuring computations as a graph is that it is easy to run nodes in parallel. Through parallel computation we do not need one very fast machine; we can also achieve fast computation with many slower ones that split up the tasks. This is why graphics processing units (GPUs) are so useful for deep learning: they have many small cores, unlike CPUs, which have only a few, but fast, cores. While a modern CPU might have 4 cores, a modern GPU can have hundreds or even thousands of cores. Even the graph of a very simple model can look quite complex, but you can still see the components of the dense layer: a matrix multiplication (matmul), the addition of a bias, and a relu activation function.
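If you want to check whether your own TensorFlow installation can see a GPU, the TensorFlow 1.x test utility offers a one-line check:

import tensorflow as tf

# True if TensorFlow can execute graph nodes on a GPU device.
print(tf.test.is_gpu_available())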

Derivatives on Computational Graphs

Another advantage of using computational graphs like this is that TensorFlow and other libraries can quickly and automatically calculate derivatives along this graph. As we know from week 1, calculating derivatives is key for training neural networks.
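Here is a minimal sketch of this in the TensorFlow 1.x API: we define a tiny graph and ask TensorFlow for the derivative of a loss with respect to a weight, without deriving anything by hand (the model, names, and values are just illustrative):

import tensorflow as tf

# A one-weight 'model': y = w * x, with a squared-error loss against 1.0.
x = tf.placeholder(tf.float32, name='input')
w = tf.Variable(2.0, name='weight')
y = w * x
loss = tf.reduce_mean(tf.square(y - 1.0))

# TensorFlow walks the graph backwards and adds gradient nodes automatically.
grad_w = tf.gradients(loss, [w])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # d(loss)/dw at x = 3.0: 2 * (w * x - 1) * x = 2 * 5 * 3 = 30
    print(sess.run(grad_w, feed_dict={x: 3.0}))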

Summary

  • TensorFlow works with Computational Graphs and Tensors
  • Operations are 'nodes' in the computational graph
  • Tensors get passed along the edges of the computational graph
  • TensorFlow can automatically compute derivatives along the computational graph

With this overview of tensors and data flows you are ready to dive deep into advanced models. Let's go!