First Steps With Theano

In which I practice some numpy and learn the boilerplate for writing Theano functions.


In [1]:
%load_ext version_information
%version_information theano, numpy


Out[1]:
Software   Version
Python     2.7.6 64bit [GCC 4.8.2]
IPython    4.0.1
OS         Linux 4.1.13 boot2docker x86_64 with Ubuntu 14.04 trusty
theano     0.7.0.dev-30cc6380863b08a3a90ecbe083ddfb629a56161d
numpy      1.8.2
Tue Dec 08 20:13:16 2015 UTC

Matrix rows are examples/observations. Matrix columns are features/dimensions of the example.


In [2]:
import numpy as np
my_matrix = np.asarray([[1., 2], [3, 4], [5, 6]])
my_matrix


Out[2]:
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]])

This is a 3 row, 2 column matrix, as verified with the shape attribute.


In [3]:
my_matrix.shape


Out[3]:
(3, 2)
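Because rows are examples and columns are features, reducing along an axis has a direct meaning. Here is a small sketch of that idea; the names feature_means and example_means are my own and are not used elsewhere in this notebook.

```python
import numpy as np

my_matrix = np.asarray([[1., 2], [3, 4], [5, 6]])

# axis=0 collapses the rows: one mean per feature (column)
feature_means = my_matrix.mean(axis=0)

# axis=1 collapses the columns: one mean per example (row)
example_means = my_matrix.mean(axis=1)

print(feature_means)
print(example_means)
```

The first result has one entry per column and the second has one entry per row, which is a quick way to check that a matrix is oriented the way you expect.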

Access elements as you would in R -- matrix[row_index, col_index] -- except that counting starts at 0.


In [4]:
my_matrix[2, 0]


Out[4]:
5.0
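Whole rows and columns work much like R's matrix[row, ] and matrix[, col], with a colon standing in for the empty slot. A minimal sketch follows (the variable names are only for illustration):

```python
import numpy as np

my_matrix = np.asarray([[1., 2], [3, 4], [5, 6]])

# Third row, all columns: like my_matrix[3, ] in R
last_row = my_matrix[2, :]

# All rows, first column: like my_matrix[, 1] in R
first_col = my_matrix[:, 0]

# Negative indices count from the end in NumPy
# (in R, a negative index drops that element instead)
last_cell = my_matrix[-1, -1]

print(last_row, first_col, last_cell)
```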

Broadcasting

NumPy does something called broadcasting: when an operation combines arrays of different shapes, it stretches the smaller array so that its shape is compatible with the larger one.

Two examples are shown below, along with images from a visual guide to broadcasting.


In [5]:
np.array([1.0, 2.0, 3.0]) * 2


Out[5]:
array([ 2.,  4.,  6.])

In the example above, the scalar 2 is stretched into a 3-element array. This example is just like c(1, 2, 3) * 2 in R.


In [6]:
a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
b = np.array([0.0, 1.0, 2.0])
a + b


Out[6]:
array([[  0.,   1.,   2.],
       [ 10.,  11.,  12.],
       [ 20.,  21.,  22.],
       [ 30.,  31.,  32.]])
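To make the stretching explicit, here is a sketch of what happens to b, and of what happens when the shapes do not line up. It assumes a NumPy recent enough to have np.broadcast_to, which is newer than the 1.8.2 listed at the top of this notebook. The array c is a hypothetical extra example.

```python
import numpy as np

a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
b = np.array([0.0, 1.0, 2.0])

# Shapes are compared starting from the last dimension: (4, 3) vs (3,).
# b is treated as if it were repeated once per row of a.
b_stretched = np.broadcast_to(b, a.shape)
same = np.array_equal(a + b, a + b_stretched)

# A length-4 vector does not match the trailing dimension of size 3
c = np.array([0.0, 1.0, 2.0, 3.0])
try:
    a + c
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Reshaped into a (4, 1) column, c broadcasts across the columns instead
per_row = a + c.reshape(4, 1)

print(same, mismatch_raised)
print(per_row)
```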

This behavior differs from R, which lays the vector down the columns of the matrix and recycles its values:

matrix(rep(0:3 * 10, 3), ncol = 3) + 0:2
#      [,1] [,2] [,3]  ==          [,1]    [,2]    [,3]
# [1,]    0    1    2  ==  [1,]   0 + 0   0 + 1   0 + 2
# [2,]   11   12   10  ==  [2,]  10 + 1  10 + 2  10 + 0
# [3,]   22   20   21  ==  [3,]  20 + 2  20 + 0  20 + 1
# [4,]   30   31   32  ==  [4,]  30 + 0  30 + 1  30 + 2

Recycling also explains why the scalar operation matrix * 2 looks like broadcasting in R: the single value is recycled for every cell in the matrix.
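For comparison, R's column-wise recycling can be imitated in NumPy. This is only a sketch of one way to do it, not something NumPy does on its own: np.resize repeats the vector until it has as many elements as the matrix, and order="F" fills the result column by column as R would.

```python
import numpy as np

m = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
v = np.array([0.0, 1.0, 2.0])

# Repeat v until it has m.size elements: 0, 1, 2, 0, 1, 2, ...
recycled_flat = np.resize(v, m.size)

# Fill a matrix of the same shape column by column (Fortran order)
recycled = recycled_flat.reshape(m.shape, order="F")

r_style_sum = m + recycled
print(r_style_sum)
```

The printed matrix should agree with the R output above rather than with the broadcast result in Out[6].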

The Theano Boilerplate

There's a recipe for doing computations in Theano. The tutorial on the website doesn't really emphasize this recipe until the end of the "Baby Steps" tutorial, so I'll put it front and center.

We want really fast mathematical computations, and we use Theano to create functions that perform them. There are three steps to making a function:

  1. Define typed variables.
  2. Create a symbolic expression describing a computation on those variables.
  3. Compile a function mapping the input variables to the expression.

Now, we have a fast function to do that computation.

Adding Two Scalars

Create a function in Theano to add two numbers together.

First, we declare the typed variables. For example, T.dscalar is a 0-dimensional array (scalar) of doubles. Next, we create a symbolic expression for the addition operation, and compile the addition function.


In [7]:
import theano.tensor as T
from theano import function

In [8]:
x = T.dscalar("x")
y = T.dscalar("y")
z = x + y
f = function([x, y], z)

Theano types are not the same as Python classes, although, confusingly, Python's built-in type function returns an object's class rather than its Theano type.


In [9]:
# the class of x
type(x)


Out[9]:
theano.tensor.var.TensorVariable

In [10]:
# the theano type of x
x.type


Out[10]:
TensorType(float64, scalar)

When we print z, we see that it describes an operation.


In [11]:
z


Out[11]:
Elemwise{add,no_inplace}.0

In [12]:
# Use pp to pretty print the computation associated to z
from theano import pp
pp(z)


Out[12]:
'(x + y)'

We perform the computation in z by calling the compiled function f or by using z's eval method.


In [13]:
f(2, 3)


Out[13]:
array(5.0)

In [14]:
f(16.3, 12.1)


Out[14]:
array(28.4)

In [15]:
z.eval({x: 3, y: 2})


Out[15]:
array(5.0)
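The results above print as array(5.0) rather than 5.0 because they are 0-dimensional NumPy arrays. Here is a NumPy-only sketch of what that means and how to get a plain Python number back; it builds the array by hand with np.array instead of calling the Theano function.

```python
import numpy as np

# Built by hand; it prints the same way as Out[13]
result = np.array(5.0)

n_dims = result.ndim     # a 0-d array has no axes
shape = result.shape     # so its shape is the empty tuple

# Two ways to pull out an ordinary Python float
as_float = float(result)
as_item = result.item()

print(n_dims, shape, as_float, as_item)
```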

Adding two matrices

Create a function to add two matrices.


In [16]:
x = T.dmatrix("x")
y = T.dmatrix("y")
z = x + y
f = function([x, y], z)

The compiled function accepts either nested Python lists or NumPy arrays as its 2D inputs.


In [17]:
f([[1, 2], [3, 4]], [[10, 20], [30, 40]])


Out[17]:
array([[ 11.,  22.],
       [ 33.,  44.]])

In [18]:
array1 = np.array(range(1, 5)).reshape([2, 2])
array2 = np.array(range(10, 50, 10)).reshape([2, 2])
f(array1, array2)


Out[18]:
array([[ 11.,  22.],
       [ 33.,  44.]])

Exercise

Modify this boilerplate code to compute a ** 2 + b ** 2 + 2 * a * b.

import theano
a = theano.tensor.vector()      # declare variable
out = a + a ** 10               # build symbolic expression
f = theano.function([a], out)   # compile function

In [19]:
a = T.vector()
b = T.vector()
z = a ** 2 + b ** 2 + 2 * a * b
f = function([a, b], z)
f(range(0, 5), range(5, 10))


Out[19]:
array([  25.,   49.,   81.,  121.,  169.])

This matches the R code solution:

a <- 0:4
b <- 5:9
a ^ 2 + b ^ 2 + 2 * a * b
#> [1]  25  49  81 121 169
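As one more sanity check, the exercise expression is the expansion of (a + b) ** 2, so plain NumPy should give the same numbers both ways. This is a sketch that does not involve Theano at all; expanded and factored are names I made up.

```python
import numpy as np

a = np.arange(0, 5, dtype=float)   # 0, 1, 2, 3, 4
b = np.arange(5, 10, dtype=float)  # 5, 6, 7, 8, 9

# The expression from the exercise
expanded = a ** 2 + b ** 2 + 2 * a * b

# The same quantity written as a perfect square
factored = (a + b) ** 2

print(expanded)
print(np.array_equal(expanded, factored))
```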