In [60]:
#configure plotting
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib
matplotlib.rcParams['figure.figsize'] = (8, 5)
import GPy
import numpy as np
from matplotlib import pyplot as plt

A brief guide to GP classification with GPy

James Hensman, November 2014

Gaussian Process classification is perhaps the most popular case of a GP model where the likelihood is not conjugate to the prior. This means that we have to approximate inference over the values of the latent function. There are many approaches to doing this; here we'll start with the popular EP (expectation propagation) method.

The generative model

To illustrate GP classification, we'll consider the generative model of the data. The idea is that there is a hidden function, drawn from a GP, which takes continuous values. Those values are then squashed through a probit function, and Bernoulli draws are made (taking the values 0 or 1) conditioned on the squashed values.

First we'll set up a kernel from which to draw some latent function values, as well as the positions ($\mathbf X$) at which we get observations.


In [61]:
k = GPy.kern.RBF(1, variance=2., lengthscale=0.3)
X = np.random.rand(200,1)

#draw the latent function values
f = np.random.multivariate_normal(np.zeros(200), k.K(X))

plt.plot(X, f, 'bo')


Out[61]:
[<matplotlib.lines.Line2D at 0xa239a90>]

Now we'll squash the latent function values through the probit. We'll set up a GPy likelihood for this, which contains the squashing probit function as well as assorted mechanisms for doing approximate inference.


In [51]:
lik = GPy.likelihoods.Bernoulli()
p = lik.gp_link.transf(f)
plt.plot(X, p, 'ro')


Out[51]:
[<matplotlib.lines.Line2D at 0x7a74090>]

Now binary Bernoulli variables are drawn:


In [52]:
Y = lik.samples(f).reshape(-1,1)
plt.plot(X, Y, 'kx', mew=2)
plt.ylim(-0.1, 1.1)


Out[52]:
(-0.1, 1.1)

Inference

Given the observed binary variables, can we recover the latent function, the associated probabilities, and the variance and lengthscale of the GP?

We'll set up a GPy classifier to do this:


In [53]:
m = GPy.core.GP(X=X,
                Y=Y, 
                kernel=k, 
                inference_method=GPy.inference.latent_function_inference.expectation_propagation.EP(),
                likelihood=lik)
print(m)


Name                 : gp
Log-likelihood       : -344.034174987
Number of Parameters : 2
Parameters:
  gp.              |  Value  |  Constraint  |  Prior  |  Tied to
  rbf.variance     |    2.0  |     +ve      |         |         
  rbf.lengthscale  |    0.3  |     +ve      |         |         

There's a simpler way to build GP classifiers, which uses some sensible default options, like so:


In [54]:
m = GPy.models.GPClassification(X,Y)

In [56]:
m.plot()
plt.plot(X, p, 'ro')
m.plot_f()
plt.plot(X, f, 'bo')


Out[56]:
[<matplotlib.lines.Line2D at 0xa20a890>]

In [57]:
print(m)


Name                 : gp_classification
Log-likelihood       : -348.63535925
Number of Parameters : 2
Parameters:
  gp_classification.  |      Value       |  Constraint  |  Prior  |  Tied to
  rbf.variance        |   1.72206022016  |     +ve      |         |         
  rbf.lengthscale     |  0.189666441481  |     +ve      |         |         

In [ ]: