A Brief Guide to Using Hybrid Monte Carlo for Gaussian Process Models

Zhenwen Dai and Mu Niu, November 2014


In [1]:
import numpy as np
import GPy
%matplotlib inline
#%config InlineBackend.figure_format = 'svg'
from pylab import *

Example 1: HMC Inference for GP Regression

Let's first generate some synthetic data.


In [2]:
# Let's make some synthetic data
x = np.linspace(0.,2*np.pi,100)[:,None]
y = -cos(x)+np.random.randn(*x.shape)*0.3+1
_ = plot(x,y,'.')


Let us make a GP regression model and assign some general prior distributions to the model parameters.


In [3]:
# Make a GP regression model
m = GPy.models.GPRegression(x,y)
# Give some general prior distributions for model parameters
m.kern.lengthscale.set_prior(GPy.priors.Gamma.from_EV(1.,10.))
m.kern.variance.set_prior(GPy.priors.Gamma.from_EV(1.,10.))
m.likelihood.variance.set_prior(GPy.priors.Gamma.from_EV(1.,10.))
_=m.plot()


WARNING: reconstraining parameters GP_regression.rbf.lengthscale
WARNING: reconstraining parameters GP_regression.rbf.variance
WARNING: reconstraining parameters GP_regression.Gaussian_noise.variance
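
As a side note, GPy.priors.Gamma.from_EV(E, V) parameterises the Gamma prior by its expectation E and variance V: with mean a/b and variance a/b^2, this corresponds to shape a = E^2/V and rate b = E/V, so E = 1 and V = 10 give Ga(0.1, 0.1), which is what the model printout further below reports. A minimal sanity check (not part of the original notebook):


In [ ]:
# Hedged sketch: check the expectation/variance parameterisation of the prior.
# E = 1, V = 10  ->  shape a = E**2/V = 0.1, rate b = E/V = 0.1.
prior = GPy.priors.Gamma.from_EV(1., 10.)
print prior  # expected to display something like Ga(0.1, 0.1)
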

Sampling the posterior distribution of model parameters

Draw samples from the posterior over the model parameters: a first run of 1000 samples serves as burn-in, followed by another 1000 samples that we keep.


In [4]:
hmc = GPy.inference.mcmc.HMC(m,stepsize=5e-2)
s = hmc.sample(num_samples=1000) # Burnin
s = hmc.sample(num_samples=1000)

Plot the samples:


In [5]:
_ = plot(s)



Plot the posterior marginal distribution of model parameters:


In [6]:
labels = ['kern variance', 'kern lengthscale','noise variance']
samples = s[300:] # cut out the burn-in period
from scipy import stats
xmin = samples.min()
xmax = samples.max()
xs = np.linspace(xmin,xmax,100)
for i in xrange(samples.shape[1]):
    kernel = stats.gaussian_kde(samples[:,i])
    plot(xs,kernel(xs),label=labels[i])
_ = legend()
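

Beyond the marginal densities, simple quantile summaries of the samples are often useful. A minimal sketch (plain NumPy percentiles, not part of the original notebook) of 95% credible intervals for each parameter:


In [ ]:
# Hedged sketch: 95% credible intervals for each parameter from the samples.
for i in xrange(samples.shape[1]):
    lo, hi = np.percentile(samples[:, i], [2.5, 97.5])
    print '%s: %.3f -- %.3f' % (labels[i], lo, hi)
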


Plot the model parameters (kernel variance, lengthscale and noise variance) against each other:


In [7]:
fig = figure(figsize=(14,4))
ax = fig.add_subplot(131)
_=ax.plot(samples[:,0],samples[:,1],'.')
ax.set_xlabel(labels[0]); ax.set_ylabel(labels[1])
ax = fig.add_subplot(132)
_=ax.plot(samples[:,1],samples[:,2],'.')
ax.set_xlabel(labels[1]); ax.set_ylabel(labels[2])
ax = fig.add_subplot(133)
_=ax.plot(samples[:,0],samples[:,2],'.')
ax.set_xlabel(labels[0]); _ = ax.set_ylabel(labels[2])



By setting the model parameters to the posterior mean, we can visualize the model fit:


In [8]:
# Set the model parameters as the posterior mean
m.kern.variance[:] = samples[:,0].mean()
m.kern.lengthscale[:] = samples[:,1].mean()
m.likelihood.variance[:] = samples[:,2].mean()
print m
_=m.plot()


Name : GP regression
Objective : 39.2970574179
Number of Parameters : 3
Number of Optimization Parameters : 3
Updates : True
Parameters:
  GP_regression.           |            value  |  constraints  |     priors   
  rbf.variance             |    1.41818926221  |      +ve      |  Ga(0.1, 0.1)
  rbf.lengthscale          |    1.53693651461  |      +ve      |  Ga(0.1, 0.1)
  Gaussian_noise.variance  |  0.0920819762721  |      +ve      |  Ga(0.1, 0.1)
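
Plugging in the posterior mean is only a point summary. A more fully Bayesian alternative is to average the predictive mean over the posterior samples; the sketch below is not part of the original notebook and simply loops over a thinned subset of the samples, calling m.predict for each:


In [ ]:
# Hedged sketch: average the predictive mean over the posterior samples
# instead of plugging in a single point estimate of the parameters.
Xtest = np.linspace(0., 2*np.pi, 200)[:, None]
pred_means = []
for theta in samples[::10]:  # thinned for speed
    m.kern.variance[:] = theta[0]
    m.kern.lengthscale[:] = theta[1]
    m.likelihood.variance[:] = theta[2]
    mu, _ = m.predict(Xtest)
    pred_means.append(mu)
_ = plot(Xtest, np.mean(pred_means, axis=0), 'b-')
_ = plot(x, y, 'k.')
# Reset to the posterior means used above before continuing with the notebook.
m.kern.variance[:] = samples[:, 0].mean()
m.kern.lengthscale[:] = samples[:, 1].mean()
m.likelihood.variance[:] = samples[:, 2].mean()
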

Sample the posterior distribution of X given some new Y

Given some new observations, inferring the posterior distribution of the corresponding inputs is difficult, because the posterior can be multi-modal.

Assume we have a new observation $y = 1.5$ and try to infer the distribution of its input. Since the underlying function is $-\cos(x)+1$, a noise-free observation of $1.5$ corresponds to $\cos(x) = -0.5$, which is satisfied by both $x = 2\pi/3$ and $x = 4\pi/3$ on $[0, 2\pi]$, so we expect a bimodal posterior over the input.


In [9]:
y_new = np.array([1.5])[:,None]

Generate the inference model for the new observation. x_new is the MAP estimate of the input, obtained by optimizing the log likelihood. As the red dot in the plot shows, the MAP estimate captures only one of the modes.


In [10]:
x_new,mi = m.infer_newX(y_new)
print mi
m.plot()
_ = plot(x_new,y_new,'or')


Name : inferenceX
Objective : -12.192425512
Number of Parameters : 1
Number of Optimization Parameters : 1
Updates : True
Parameters:
  inferenceX.  |          value  |  constraints  |  priors
  latent mean  |  [ 2.11646619]  |               |        

Draw 10,000 samples from the inference model:


In [11]:
hmc_new = GPy.inference.mcmc.HMC(mi,stepsize=2e-1)
s_new = hmc_new.sample(num_samples=10000,hmc_iters=10)

Plot the samples:


In [12]:
_ = plot(s_new[:,:])


Plot the marginal distribution of the inferred input. The two modes are clearly visible in the sampled posterior distribution.


In [13]:
from scipy import stats
samples_new = s_new[:]
xmin = samples_new.min()
xmax = samples_new.max()
xs = np.linspace(xmin,xmax,100)
for i in xrange(samples_new.shape[1]):
    kernel = stats.gaussian_kde(samples_new[:,i])
    plot(xs,kernel(xs))
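

The modes can also be located numerically from the kernel density estimate; a small sketch using scipy.signal.argrelmax (not part of the original notebook):


In [ ]:
# Hedged sketch: find local maxima of the estimated density to locate the modes.
from scipy.signal import argrelmax
dens = stats.gaussian_kde(samples_new[:, 0])(xs)
idx = argrelmax(dens)[0]  # increase the `order` argument to smooth out spurious peaks
print xs[idx]
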


Example 2: HMC for the lengthscale and variance with the Olympic marathon data

We set priors for the lengthscale and variance of the kernel. The means of the priors are chosen close to the values found by GP optimisation, while the variances are kept large. In the example below, we place Gamma priors on the lengthscale and variance with E(lengthscale) = 120, Var(lengthscale) = 2000, E(variance) = 25 and Var(variance) = 150.

In [14]:
m = GPy.examples.regression.olympic_marathon_men()
#
#set prior for lengthscale and variance.
m.kern.variance.set_prior(GPy.priors.Gamma.from_EV(25.,150.))
m.kern.lengthscale.set_prior(GPy.priors.Gamma.from_EV(120.,2000.))
print m


WARNING: reconstraining parameters GP_regression.rbf.variance
WARNING: reconstraining parameters GP_regression.rbf.lengthscale

Name : GP regression
Objective : 15.5939533439
Number of Parameters : 3
Number of Optimization Parameters : 3
Updates : True
Parameters:
  GP_regression.           |            value  |  constraints  |     priors    
  rbf.variance             |    25.3995048123  |      +ve      |  Ga(4.2, 0.17)
  rbf.lengthscale          |    152.045313268  |      +ve      |  Ga(7.2, 0.06)
  Gaussian_noise.variance  |  0.0485064747583  |      +ve      |               

Run HMC

We plot the full length of the HMC run. The first 5000 samples can be treated as the burn-in stage and ignored.

In [15]:
# initialise hmc
hmc = GPy.inference.mcmc.HMC(m,stepsize=2e-1)
# run hmc
t = hmc.sample(num_samples=20000,hmc_iters=20)

# Sample parameters 
#hmc = GPy.inference.optimization.HMC(m, stepsize=5e-1)
#t = hmc.sample(m_iters=50000,hmc_iters=20)

_=plot(t)



In [16]:
print t.mean(axis=0)
print t.std(axis=0)
_=hist(t[:,:2],50)


[  2.34246741e+01   1.25455721e+02   5.32988933e-02]
[  1.06955975e+01   3.11140579e+01   1.72613018e-02]
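
Following the remark above that the first 5000 samples can be treated as burn-in, a minimal sketch (not part of the original notebook) of recomputing the summaries without them:


In [ ]:
# Hedged sketch: discard the first 5000 samples as burn-in before summarising.
t_post = t[5000:]
print t_post.mean(axis=0)
print t_post.std(axis=0)
_ = hist(t_post[:, :2], 50)
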

Seaborn

Using Seaborn to plot distributions over the hyperparameters:


In [17]:
import seaborn as sns, pandas as pd
plt.rcParams['text.usetex'] = False

In [18]:
df = pd.DataFrame(t, columns=m.parameter_names_flat())
ax = sns.kdeplot(df['rbf.variance'],
                 color="b", shade=True, shade_lowest=False)
ax = sns.kdeplot(df['rbf.lengthscale'],
                 color="r", shade=True, shade_lowest=False)



In [19]:
sns.set(style="white", color_codes=True)
_ = sns.jointplot(data=df, x='rbf.variance', y='rbf.lengthscale', kind="hex", 
                  marginal_kws=dict(kde=True, hist=True, kde_kws=dict(shade=False)),
                 stat_func=None
                 )



In [20]:
df


Out[20]:
rbf.variance rbf.lengthscale Gaussian_noise.variance
0 25.698306 156.070699 0.059547
1 18.743578 156.595044 0.074173
2 13.887465 152.355175 0.069016
3 14.155445 151.787650 0.081634
4 20.421911 154.500416 0.063637
5 30.195630 152.830979 0.072222
6 26.109314 157.151448 0.055252
7 20.467120 156.583988 0.052092
8 30.400506 152.062560 0.064315
9 28.352966 152.437746 0.055480
10 29.774243 153.919031 0.035826
11 33.207275 151.824560 0.037272
12 32.003007 145.061784 0.057754
13 26.803550 145.314209 0.055612
14 25.152094 138.583572 0.040753
15 27.634485 132.632704 0.049969
16 29.605808 137.675906 0.036453
17 34.438591 136.449935 0.026122
18 35.887368 136.237108 0.028379
19 35.887368 136.237108 0.028379
20 43.910838 133.445562 0.026832
21 43.485700 139.080476 0.033924
22 48.157112 140.288812 0.044917
23 53.635800 130.723554 0.033170
24 53.635800 130.723554 0.033170
25 57.156723 121.930448 0.027956
26 53.572664 124.038815 0.036399
27 49.700571 130.319728 0.045346
28 48.870969 127.856158 0.034672
29 47.474023 127.517905 0.053423
... ... ... ...
19970 13.297307 109.510459 0.040657
19971 13.297307 109.510459 0.040657
19972 13.297307 109.510459 0.040657
19973 17.929041 112.442954 0.059544
19974 24.739505 113.709543 0.040424
19975 24.341697 116.804279 0.045263
19976 19.742260 121.770468 0.046144
19977 25.370468 119.295514 0.050279
19978 22.303060 126.546828 0.038183
19979 18.741118 120.461284 0.033619
19980 13.677802 126.895327 0.039083
19981 14.740390 122.796441 0.041132
19982 10.440311 124.437531 0.034105
19983 18.213002 129.423301 0.043406
19984 17.618734 125.726755 0.032402
19985 19.329991 127.194190 0.050174
19986 18.250197 126.106561 0.054068
19987 15.832616 125.425730 0.055303
19988 8.600004 122.485631 0.054096
19989 13.351543 121.282395 0.062515
19990 17.307096 121.825491 0.039614
19991 15.615636 122.234168 0.043671
19992 15.407677 121.591582 0.048710
19993 15.772050 126.809752 0.065785
19994 11.103071 131.416994 0.087163
19995 13.936923 128.758211 0.085912
19996 14.504106 125.278166 0.099385
19997 13.472985 128.396493 0.083822
19998 12.813494 127.803197 0.086454
19999 10.234419 123.825535 0.078223

20000 rows × 3 columns
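

To quantify how the hyperparameters co-vary under the posterior, a quick check with standard pandas (not part of the original notebook), again dropping the first 5000 samples as burn-in:


In [ ]:
# Hedged sketch: pairwise sample correlations between the hyperparameters.
print df.iloc[5000:].corr()
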

