We discussed some of the problems with variational mean-field optimisation algorithms in a previous post and showed that the optimisation problem can be solved using the machine-learning library theano. Let's do the same using tensorflow, which provides a convenient optimisation interface.

Here are some simulated data.


In [1]:
%matplotlib inline
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import rcParams
rcParams['figure.figsize'] = (9, 6)
# This module contains a few helper functions
from linear_regression import *

# Generate data and plot it
X, y, theta = generate_data(theta=[1, 2])
plot_data(X, y, theta)

# Use same starting position for all optimisers
n, p = X.shape
mu0 = np.random.normal(0, 1, p)


Recall that the evidence lower bound (ELBO) is given by

$$ \mathcal{L}\left(\mu\right) = -\frac{1}{2}\sum_{i=1}^n\left(\sum_{j=1}^p\sum_{k=1}^p X_{ij}\mu_j X_{ik}\mu_k - 2y_i\sum_{j=1}^p X_{ij}\mu_j\right). $$

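Since the ELBO is quadratic in $\mu$, writing the sums in matrix form gives its gradient and unique maximiser directly:

$$ \nabla\mathcal{L}\left(\mu\right) = X^T y - X^T X \mu, \qquad \mu^* = \left(X^T X\right)^{-1} X^T y, $$

i.e. the optimum coincides with the ordinary least-squares estimate, a useful target to check the optimisation against.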
Let's construct a symbolic computation graph in tensorflow to compute the ELBO.


In [2]:
import tensorflow as tf

with tf.Graph().as_default() as graph:
    # Define variables
    t_X = tf.placeholder(tf.float32, name='X')
    t_y = tf.placeholder(tf.float32, name='y')
    t_mu = tf.Variable(mu0.astype(np.float32), name='mu')

    # Compute the linear predictor X mu via an elementwise product and a
    # sum over the feature axis (in lieu of a vector dot product)
    t_predictor = tf.reduce_sum(t_X * t_mu, [1])

    # Compute the ELBO defined above
    t_elbo = -0.5 * tf.reduce_sum(t_predictor ** 2) + tf.reduce_sum(t_predictor * t_y)
    
    # Create a momentum optimiser; minimising the negative ELBO maximises it
    optimizer = tf.train.MomentumOptimizer(1e-2 / n, .3)
    train_step = optimizer.minimize(-t_elbo)

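For reference, the optimiser above implements the classical momentum update: with learning rate $\eta = 10^{-2}/n$ and momentum $\gamma = 0.3$,

$$ v_{t+1} = \gamma v_t + \nabla_\mu\left(-\mathcal{L}\right), \qquad \mu_{t+1} = \mu_t - \eta v_{t+1}, $$

where the accumulator $v$ smooths successive gradient steps; this matches the documented behaviour of tf.train.MomentumOptimizer.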
tensorflow makes use of fast libraries such as BLAS behind the scenes and stores all state in a session that is decoupled from the python kernel. We first need to start such a session and feed it the data; we can then repeatedly call the train step defined above to improve the fit of the model.


In [5]:
# Start a session
with tf.Session(graph=graph) as sess:
    # Initialise variables
    sess.run(tf.initialize_all_variables())
    
    # Run the optimisation
    trace = []
    elbos = []
    for step in range(50):
        # Take an optimisation step and evaluate the current mu and ELBO
        _, mu, elbo = sess.run([train_step, t_mu, t_elbo], {t_X: X, t_y: y})
        trace.append(mu)
        elbos.append(elbo)
    
plot_trace(elbos, trace, theta)
pass
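
As a quick sanity check (a sketch, not part of the original helper module), the ELBO for the final estimate can be evaluated directly in numpy and compared with the last value recorded during training:


In [ ]:
# Evaluate the ELBO for the final mu directly in numpy
predictor = X.dot(trace[-1])
print(-0.5 * np.sum(predictor ** 2) + np.sum(predictor * y), elbos[-1])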


Let's plot the trajectory of the optimisation through parameter space.


In [6]:
plot_trajectory(X, y, trace, theta, levels=np.logspace(4, 4.45, 20))

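The trajectory should approach the least-squares solution $\mu^*$ derived above; here is a minimal check (again only a sketch, using numpy):


In [ ]:
# Compare the final iterate with the closed-form ELBO maximiser
mu_star = np.linalg.solve(X.T.dot(X), X.T.dot(y))
print(mu_star, trace[-1])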

tensorflow and theano are very similar packages, both aimed at large-scale machine learning. tensorflow still has a number of annoying peculiarities (such as the lack of a standard implementation of the dot product), but it promises to develop into a great product and provides a range of optimisers out of the box. theano is somewhat faster and more stable, but optimisation can be tedious because optimisers are not provided.
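
For instance, the elementwise multiply-and-sum used for the linear predictor above stands in for a matrix-vector product. The same quantity can be expressed with the two-dimensional tf.matmul by reshaping (a sketch, assuming the t_X and t_mu defined in the graph above):


In [ ]:
# Equivalent linear predictor using 2-D matmul instead of multiply-and-sum
t_mu_col = tf.reshape(t_mu, [-1, 1])                # shape (p, 1)
t_predictor = tf.squeeze(tf.matmul(t_X, t_mu_col))  # squeeze back to shape (n,)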