Gaussian process + HMC

Investigations into, and an implementation of, the method described in this paper by Rasmussen:

http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/pdfs/pdf2080.pdf

Gaussian processes

A simple example with the sklearn interface


In [1]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
import numpy as np

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel


def f_target(x):
    return x * np.sin(x)


def _reshapex(x):
    ''' Reshape an x value to include a second dimension '''
    assert len(x.shape) == 1
    return x.reshape((x.shape[0], 1))


# The points and values which we have sampled
x = np.linspace(1, 8, 6)
y = f_target(x)

# Fit a gaussian process
# The tuples in the arguments to the kernels indicate the valid range of the parameters
# when calling 'fit'
kernel = ConstantKernel(1, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
print('Prior kernel:', kernel)

# n_restarts_optimizer helps to avoid getting stuck in a local optimum when searching for
# hyperparameter values. If the hyperparameters were fixed *exactly*, then this could be left
# at the default value, which is 0.
regressor = GaussianProcessRegressor(kernel, n_restarts_optimizer=20)
regressor.fit(_reshapex(x), y)
print('MLE kernel  :', regressor.kernel_)


# Evaluate our GP at many more points - include the uncertainty at every point
x_eval = np.linspace(0, 10, 500)
y_eval, sigma_eval = regressor.predict(_reshapex(x_eval), return_std=True)

plt.figure(figsize=(8, 5))
plt.plot(x_eval, f_target(x_eval), label=r'$f(x) = x\,\sin(x)$')
plt.plot(x, y, 'r.', label='Observations')
plt.plot(x_eval, y_eval, 'b-', label='Prediction')
plt.fill(np.concatenate([x_eval, x_eval[::-1]]),
         np.concatenate([(y_eval - 1.96 * sigma_eval),
                         (y_eval + 1.96 * sigma_eval)[::-1]]),
         alpha=.5, fc='b', ec='None', label='95% CI')
plt.xlabel('$x$')
plt.ylabel('$f(x)$')
plt.legend();


Prior kernel: 1**2 * RBF(length_scale=10)
MLE kernel  : 4.96**2 * RBF(length_scale=1.8)
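
The fitted hyperparameters are those that maximise the log marginal likelihood of the training data. As a quick aside (a sketch, not one of the original cells), the fitted kernel should therefore score at least as well as the prior kernel under regressor.log_marginal_likelihood:

print(regressor.log_marginal_likelihood(regressor.kernel.theta))   # prior hyperparameters
print(regressor.log_marginal_likelihood(regressor.kernel_.theta))  # fitted hyperparameters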

In order to do HMC on a function, we need to compute derivatives. First let's check that we can map terms from the GP regressor to quantities described in the paper. It looks like "kernel_" is a callable that computes the covariance function, and "alpha_" is a vector corresponding to $C(x, x)^{-1}y$ (where $y$ are the training target values), referring to equation (5) in the paper.


In [2]:
# This is how we would get the C(x^*, x) term:
# cov = regressor.kernel_(_reshapex(x_eval), _reshapex(x))

cov_xx = regressor.kernel_(_reshapex(x), _reshapex(x))

# x & y are also available from the regressor as X_train_ and y_train_
alpha2 = np.linalg.inv(cov_xx).dot(y)

# We are just trying to show that these two things are equivalent
print(regressor.alpha_)
print(alpha2)


[-0.0947484   0.10475623  0.35341121 -0.87569887  0.45340547  0.22443768]
[-0.0947484   0.10475623  0.35341121 -0.87569887  0.45340547  0.22443768]

Good, so now we need to compute, from equation (7):

$$ \mathbb{E}\left[\frac{\partial y^*}{\partial x^*_d} \right] = \frac{\partial C(x^*, x)}{\partial x^*_d} C(x, x)^{-1} y. $$

In [3]:
from pypuffin.plaintext import instance_to_string

const_kernel = regressor.kernel_.k1
rbf_kernel = regressor.kernel_.k2
print(instance_to_string(const_kernel, include_type_properties=True))
print(instance_to_string(rbf_kernel, include_type_properties=True))


ConstantKernel<
  bounds: array([[-6.90775528,  6.90775528]]),
  constant_value: 24.578220677188231,
  constant_value_bounds: (0.001, 1000.0),
  hyperparameter_constant_value: Hyperparameter(name='constant_value', value_type='numeric', bounds=array([[  1.00000000e-03,   1.00000000e+03]]), n_elements=1, fixed=False),
  hyperparameters: [Hyperparameter(name='constant_value', value_type='numeric', bounds=array([[  1.00000000e-03,   1.00000000e+03]]), n_elements=1, fixed=False)],
  n_dims: 1,
  theta: array([ 3.20186071]),
>
RBF<
  anisotropic: False,
  bounds: array([[-4.60517019,  4.60517019]]),
  hyperparameter_length_scale: Hyperparameter(name='length_scale', value_type='numeric', bounds=array([[  1.00000000e-02,   1.00000000e+02]]), n_elements=1, fixed=False),
  hyperparameters: [Hyperparameter(name='length_scale', value_type='numeric', bounds=array([[  1.00000000e-02,   1.00000000e+02]]), n_elements=1, fixed=False)],
  length_scale: 1.8019386980490439,
  length_scale_bounds: (0.01, 100.0),
  n_dims: 1,
  theta: array([ 0.58886314]),
>

In [4]:
print(instance_to_string(regressor))


GaussianProcessRegressor<
  L_: array([[  4.95764265e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
       [  3.66603875e+00,   3.33742124e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
       [  1.48239024e+00,   3.81744140e+00,   2.79425857e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00],
       [  3.27771892e-01,   1.84200201e+00,   3.81399545e+00,
          2.55563173e+00,   0.00000000e+00,   0.00000000e+00],
       [  3.96300420e-02,   4.43363465e-01,   2.00335786e+00,
          3.79727886e+00,   2.43871063e+00,   0.00000000e+00],
       [  2.62011903e-03,   5.59911727e-02,   5.03657345e-01,
          2.08332819e+00,   3.78478599e+00,   2.37834958e+00]]),
  X_train_: array([[ 1. ],
       [ 2.4],
       [ 3.8],
       [ 5.2],
       [ 6.6],
       [ 8. ]]),
  alpha: 1e-10,
  alpha_: array([-0.0947484 ,  0.10475623,  0.35341121, -0.87569887,  0.45340547,
        0.22443768]),
  copy_X_train: True,
  kernel: 1**2 * RBF(length_scale=10),
  kernel_: 4.96**2 * RBF(length_scale=1.8),
  log_marginal_likelihood_value_: -15.043499864697001,
  n_restarts_optimizer: 20,
  normalize_y: False,
  optimizer: 'fmin_l_bfgs_b',
  random_state: None,
  y_train_: array([ 0.84147098,  1.62111163, -2.32505999, -4.59396421,  2.056173  ,
        7.91486597]),
>

In [5]:
X_eval = _reshapex(x_eval)
X = _reshapex(x)

In [6]:
# This is the difference between each evaluation point and each training point, in a vector sense. It is
# therefore a three-index array of shape
# (n_eval, n_train, dim)
x_twiddle = X_eval[:, np.newaxis, :] - X[np.newaxis, :, :]
print(x_twiddle.shape)

# kernel(X_eval, X) --> shape of (n_eval, n_train)
print(rbf_kernel(X_eval, X).shape)

# Therefore the gradient, which takes the derivative of each component of this covariance matrix with respect
# to the *corresponding* evaluation point's coordinates, should have shape
# (n_eval, n_train, dim)
rbf_kernel_derivative = - (1 / rbf_kernel.length_scale ** 2) * x_twiddle * rbf_kernel(X_eval, X)[:, :, np.newaxis]
print(rbf_kernel_derivative.shape)

# Let's just check that this seems like a plausible value for the derivative with a finite-difference approximation
dx = 1e-6
approx_deriv = (rbf_kernel(X_eval + dx, X) - rbf_kernel(X_eval, X)) / dx
print(approx_deriv[0])
print(rbf_kernel_derivative[0].flatten())


(500, 6, 1)
(500, 6)
(500, 6, 1)
[  2.64023880e-01   3.04454206e-01   1.26650069e-01   2.48988862e-02
   2.48266953e-03   1.29272762e-04]
[  2.64023971e-01   3.04454156e-01   1.26650011e-01   2.48988687e-02
   2.48266720e-03   1.29272611e-04]

In [7]:
kernel_derivative = const_kernel.constant_value * rbf_kernel_derivative
print(kernel_derivative.shape)


(500, 6, 1)

This means we have found the term $\frac{\partial C(x^*, x)}{\partial x^*_d}$. All that is left is to multiply through by $\alpha = C(x, x)^{-1} y$. As seen below, $\alpha$ has shape (n_train,), so we need to broadcast manually in the appropriate directions.


In [8]:
print(regressor.alpha_.shape)

# Two equivalent ways of contracting over the training-point axis
gradient = np.sum(kernel_derivative * regressor.alpha_[np.newaxis, :, np.newaxis], axis=1)
gradient_2 = np.tensordot(kernel_derivative, regressor.alpha_, (1, 0))
print(gradient.shape)
print(np.allclose(gradient, gradient_2))

# This should now match the 'proper' implementation pushed to pypuffin
from pypuffin.sklearn.gaussian_process import gradient_of_mean

gradient_3 = gradient_of_mean(regressor, X_eval)
print(np.allclose(gradient, gradient_3))


(6,)
(500, 1)
True
True
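
As an extra sanity check (a sketch, not one of the original cells), the full mean gradient can also be compared against a one-sided finite difference of regressor.predict:

# Finite-difference check of the full mean gradient
dx = 1e-6
fd_gradient = (regressor.predict(X_eval + dx) - regressor.predict(X_eval)) / dx
print(np.allclose(gradient.flatten(), fd_gradient, atol=1e-3))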

In [9]:
plt.figure(figsize=(8, 5))
plt.plot(x_eval, y_eval, 'b-', label='mean')
plt.plot(x_eval, gradient_3.flatten(), 'r-', label='gradient')
plt.xlabel('$x$')
plt.ylabel('$f(x)$')
plt.legend();


We also need the derivative of the standard deviation - that's because in the exploratory phase we're going to be performing HMC with the potential function $\mu - \sigma$; that is, very uncertain points are favoured. For training points $x$, evaluation points $x^*$ and kernel $C$, we have a vector of uncertainties $\sigma_i$, one per evaluation point. The expression for this is as below, where summing is explicit, greek indices run over the training points, and roman indices over the evaluation points.

$$ \sigma_i = \sqrt{C(x^*, x^*)_{ii} - \sum_{\nu,\,\tau} C(x^*, x)_{i\nu} C^{-1}(x, x)_{\nu\tau} C(x^*, x)_{i\tau}}. $$

So... \begin{align} \frac{\partial \sigma_i}{\partial x^*_d} &= \frac{1}{2\sigma_i} \frac{\partial}{\partial x^*_d} \left( C(x^*, x^*)_{ii} - \sum_{\nu,\,\tau} C(x^*, x)_{i\nu} C^{-1}(x, x)_{\nu\tau} C(x^*, x)_{i\tau} \right) \\ &= \frac{1}{2\sigma_i} \left( \frac{\partial C(x^*, x^*)_{ii}}{\partial x^*_d} - 2 \sum_{\nu,\,\tau} \frac{\partial C(x^*, x)_{i\nu}}{\partial x^*_d} C^{-1}(x, x)_{\nu\tau} C(x^*, x)_{i\tau} \right). \end{align}
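
A direct (if inefficient) numpy transcription of this expression, reusing kernel_derivative from above, might look like the sketch below. This is only a check on the algebra, not the pypuffin gradient_of_std implementation used later; note that for the ConstantKernel * RBF kernel fitted here the diagonal $C(x^*, x^*)_{ii}$ is constant, so its derivative vanishes and only the second term survives.

# Sketch: direct transcription of the derivative of sigma derived above
K_eval_train = regressor.kernel_(X_eval, X)            # C(x*, x), shape (n_eval, n_train)
K_train_inv = np.linalg.inv(regressor.kernel_(X, X))   # C(x, x)^{-1}
_, sigma = regressor.predict(X_eval, return_std=True)
sigma_gradient = -np.einsum('ind,nt,it->id',
                            kernel_derivative, K_train_inv, K_eval_train) / sigma[:, np.newaxis]
print(sigma_gradient.shape)  # (n_eval, dim)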


In [10]:
# Notes from looking through the code in GaussianProcessRegressor.predict(..), under the
# 'return_std' block.

# L_ is some kind of Cholesky decomposition I think...
plt.imshow(regressor.L_);

# ... in particular, it is the Cholesky decomposition of the covariance of the training points
print(np.allclose(regressor.L_, np.linalg.cholesky(regressor.kernel_(X, X))))

# This is the diagonal of the covariance matrix
regressor.kernel_.diag(X)


True
Out[10]:
array([ 24.57822068,  24.57822068,  24.57822068,  24.57822068,
        24.57822068,  24.57822068])

In [11]:
from numbers import Real

from pypuffin.decorators import accepts
from pypuffin.types import Callable

@accepts(np.ndarray, np.ndarray, Callable, Callable, eps=Real)
def leapfrog_step(q, p, f_grad_potential, f_grad_kinetic, eps=1e-3):
    ''' Arguments:
            q: ndarray of shape (N,) -- variables of interest
            p: ndarray of shape (N,) -- auxiliary variables
            f_grad_potential: q -> (N,) : gradient of potential wrt. q
            f_grad_kinetic: p -> (N,) : gradient of kinetic energy wrt. p
        
        Returns q(t + eps), p(t + eps)
    '''
    p_1 = p - (eps / 2) * f_grad_potential(q)
    q_2 = q + eps * f_grad_kinetic(p_1)
    p_2 = p_1 - (eps / 2) * f_grad_potential(q_2)
    return q_2, p_2
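
For context, a leapfrog step like this would normally sit inside a full HMC transition that resamples the momentum and applies a Metropolis accept/reject on the change in total energy. The sketch below is a generic textbook version, not the pypuffin.numeric.mcmc.hmc implementation; f_potential is an assumed callable (not defined elsewhere in this notebook) returning the scalar potential energy at q, and unit mass is assumed for the kinetic term.

def hmc_transition(q, f_potential, f_grad_potential, n_steps=20, eps=1e-2):
    ''' Sketch of one generic HMC update built on the leapfrog step above '''
    p = np.random.normal(size=q.shape)
    q_new, p_new = q, p
    for _ in range(n_steps):
        q_new, p_new = leapfrog_step(q_new, p_new, f_grad_potential, lambda p_: p_, eps=eps)
    # Metropolis accept/reject on the change in total energy
    h_old = f_potential(q) + 0.5 * p.dot(p)
    h_new = f_potential(q_new) + 0.5 * p_new.dot(p_new)
    if np.random.uniform() < np.exp(h_old - h_new):
        return q_new
    return q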

Test the leapfrog integration with a simple gravitational potential & kinetic energy term. NOTE - the leapfrog step above is actually subtly incorrect; the correct implementation is in pypuffin.numeric.mcmc.hmc. The incorrectness doesn't matter much for the examples below, but when doing HMC it leads to incorrect sampling behaviour.


In [12]:
m = 1
f_grad_kinetic = lambda p: p / m
# For inverse square potential
f_grad_potential = lambda q: q * (q.dot(q) ** (- 3 / 2))
eps = 0.01

q_series = [np.asarray([0.2, 1])]
p_series = [np.asarray([0.6, 0])]

for t in range(1000):
    q, p = q_series[-1], p_series[-1]
    q, p = leapfrog_step(q, p, f_grad_potential, f_grad_kinetic, eps=eps)
    q_series.append(q)
    p_series.append(p)

q_t = np.asarray(q_series).T
# plt.plot(q_t[0], q_t[1], '-')
plt.figure(figsize=(12, 8))
plt.scatter(q_t[0], q_t[1], c=np.arange(q_t.shape[1]))
plt.plot([0], [0], 'ro', ms=10)
plt.colorbar();
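
As a quick quantitative check of energy conservation (a sketch; it assumes the potential corresponding to the inverse-square force above is U(q) = -1/|q|, with unit mass):

# Sketch: total energy along the trajectory
q_arr = np.asarray(q_series)
p_arr = np.asarray(p_series)
kinetic = 0.5 * np.sum(p_arr ** 2, axis=1) / m
potential = -1 / np.linalg.norm(q_arr, axis=1)
energy = kinetic + potential
print(energy.min(), energy.max())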


Good - it seems that we're getting reasonable energy conservation. Now plug in our GP.


In [13]:
m = 1
f_grad_kinetic = lambda p: p / m
# Use our fitted gp as the potential. Note that it expects to take many
# points at once, so pass a list of length 1, and unpack the result
f_grad_potential = lambda q: gradient_of_mean(regressor, np.asarray([q]))[0]
eps = 0.01

q_series = [np.asarray([3])]
p_series = [np.asarray([0.5])]

for t in range(1000):
    q, p = q_series[-1], p_series[-1]
    q, p = leapfrog_step(q, p, f_grad_potential, f_grad_kinetic, eps=eps)
    q_series.append(q)
    p_series.append(p)

q_t = np.asarray(q_series)
# plt.plot(q_t[0], q_t[1], '-')
plt.figure(figsize=(12, 8))
t = np.arange(q_t.shape[0])
# plt.scatter(q_t[:, 0], regressor.predict(q_t) + 0.01 * t, c=t)
plt.scatter(q_t[:, 0], 0.01 * t, c=t)
plt.plot(x_eval, y_eval, 'b-', label='potential')
plt.colorbar()
plt.legend();


Now do a similar thing, but with the target being $\mu-\sigma$ of the GP prediction.


In [14]:
from pypuffin.sklearn.gaussian_process import gradient_of_std

m = 1
f_grad_kinetic = lambda p: p / m
# Use our fitted gp (mu - std) as the potential. Note that it expects to take many
# points at once, so pass a list of length 1, and unpack the result
f_grad_potential = lambda q: (
    gradient_of_mean(regressor, np.asarray([q])) -
    gradient_of_std(regressor, np.asarray([q])))[0]
eps = 0.01

q_series = [np.asarray([3])]
p_series = [np.asarray([0.5])]

for t in range(1000):
    q, p = q_series[-1], p_series[-1]
    q, p = leapfrog_step(q, p, f_grad_potential, f_grad_kinetic, eps=eps)
    q_series.append(q)
    p_series.append(p)

q_t = np.asarray(q_series)
# plt.plot(q_t[0], q_t[1], '-')
plt.figure(figsize=(12, 8))
t = np.arange(q_t.shape[0])
# plt.scatter(q_t[:, 0], regressor.predict(q_t) + 0.01 * t, c=t)
plt.scatter(q_t[:, 0], 0.01 * t, c=t)
plt.plot(x_eval, y_eval - sigma_eval, 'b-', label='potential (mean - std)')
plt.colorbar()
plt.legend();



In [15]:
# And for reference, here is the derivative that we're using for (mean - std). It has some interesting discontinuities
plt.figure(figsize=(8, 5))
plt.plot(x_eval, y_eval, 'g-', label='mean')
plt.plot(x_eval, y_eval - sigma_eval, 'b-', label='mean - std')
plt.plot(x_eval, (gradient_of_mean(regressor, X_eval) - gradient_of_std(regressor, X_eval)).flatten(), 
         'r-', label='gradient')
plt.xlabel('$x$')
plt.ylabel('$f(x)$')
plt.legend();



In [18]:
# Performance could be an issue... it's unfortunate that we are effectively calling our
# gradient function, which is efficient when called with large arrays, with single elements
# inside a python loop.
from pypuffin.decorators import disable_typechecking
from pypuffin.profile import Profiler, Timer

with disable_typechecking(), Profiler() as profiler, Timer() as timer:
    for t in range(1000):
        q, p = q_series[-1], p_series[-1]
        q, p = leapfrog_step(q, p, f_grad_potential, f_grad_kinetic, eps=eps)
        q_series.append(q)
        p_series.append(p)
print(profiler)
print(timer)


         996007 function calls (962007 primitive calls) in 1.375 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
23000/1000    0.029    0.000    1.375    0.001 /Users/thomas/Documents/Programming/pypuffin/pypuffin/decorators.py:219(new_func)
     1000    0.009    0.000    1.373    0.001 <ipython-input-11-84d099e21d30>:6(leapfrog_step)
     2000    0.012    0.000    1.361    0.001 <ipython-input-14-0a4d418383a9>:7(<lambda>)
     2000    0.048    0.000    0.990    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/sklearn/gaussian_process.py:85(gradient_of_std)
18000/6000    0.029    0.000    0.811    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/sklearn/gaussian_process.py:12(_gradient_kernel)
     6000    0.040    0.000    0.791    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/sklearn/gaussian_process.py:51(_gradient_product)
    14000    0.153    0.000    0.628    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:1176(__call__)
     6000    0.084    0.000    0.349    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/sklearn/gaussian_process.py:44(_gradient_rbf)
    14000    0.104    0.000    0.337    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/spatial/distance.py:1838(cdist)
     2000    0.006    0.000    0.328    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/sklearn/gaussian_process.py:65(gradient_of_mean)
     2000    0.029    0.000    0.174    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/linalg/basic.py:227(solve_triangular)
     2000    0.006    0.000    0.117    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:726(__call__)
    92000    0.107    0.000    0.107    0.000 {built-in method numpy.core.multiarray.array}
     8000    0.044    0.000    0.107    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:970(__call__)
    14000    0.027    0.000    0.098    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:36(_check_length_scale)
     2000    0.002    0.000    0.081    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/linalg/lapack.py:469(get_lapack_funcs)
     2000    0.014    0.000    0.079    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/linalg/blas.py:226(_get_funcs)
    40000    0.013    0.000    0.068    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:463(asarray)
    22000    0.040    0.000    0.066    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/shape_base.py:63(atleast_2d)
    28000    0.016    0.000    0.065    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/spatial/distance.py:137(_convert_to_double)
     4000    0.017    0.000    0.062    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/_lib/_util.py:192(_asarray_validated)
     2000    0.009    0.000    0.058    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/linalg/blas.py:177(find_best_blas_type)
    12000    0.010    0.000    0.056    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:150(ones)
     2000    0.028    0.000    0.053    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:1157(tensordot)
    28000    0.009    0.000    0.049    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:586(ascontiguousarray)
     2000    0.005    0.000    0.047    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numerictypes.py:964(find_common_type)
    14000    0.017    0.000    0.044    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/_lib/six.py:130(callable)
     4000    0.030    0.000    0.040    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numerictypes.py:942(_can_coerce_all)
     4000    0.012    0.000    0.039    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/lib/function_base.py:1152(asarray_chkfinite)
     4000    0.004    0.000    0.037    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/einsumfunc.py:703(einsum)
    14000    0.023    0.000    0.037    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/fromnumeric.py:1143(squeeze)
     2000    0.005    0.000    0.036    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:760(diag)
     4000    0.032    0.000    0.032    0.000 {built-in method numpy.core.multiarray.c_einsum}
    22000    0.031    0.000    0.031    0.000 {built-in method numpy.core.multiarray.zeros}
    14000    0.019    0.000    0.030    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/spatial/distance.py:141(_filter_deprecated_kwargs)
    12000    0.029    0.000    0.029    0.000 {built-in method numpy.core.multiarray.copyto}
    14000    0.028    0.000    0.028    0.000 {method 'astype' of 'numpy.generic' objects}
    14000    0.012    0.000    0.026    0.000 {built-in method builtins.any}
     4000    0.003    0.000    0.024    0.000 {method 'all' of 'numpy.ndarray' objects}
     2000    0.010    0.000    0.021    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:1012(diag)
     4000    0.002    0.000    0.021    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/_methods.py:40(_all)
    24000    0.009    0.000    0.020    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:534(asanyarray)
     4000    0.020    0.000    0.020    0.000 {method 'reduce' of 'numpy.ufunc' objects}
    12000    0.017    0.000    0.017    0.000 {built-in method numpy.core.multiarray.empty}
    14000    0.015    0.000    0.015    0.000 {built-in method scipy.spatial._distance_wrap.cdist_sqeuclidean_wrap}
    42000    0.014    0.000    0.014    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/_lib/six.py:131(<genexpr>)
    88000    0.014    0.000    0.014    0.000 {built-in method builtins.len}
    28000    0.014    0.000    0.014    0.000 {method 'squeeze' of 'numpy.generic' objects}
     6000    0.006    0.000    0.014    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/sklearn/gaussian_process.py:38(_gradient_constant)
     2000    0.010    0.000    0.013    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/lib/twodim_base.py:139(eye)
    60000    0.012    0.000    0.012    0.000 {method 'pop' of 'dict' objects}
     2000    0.002    0.000    0.010    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/sklearn/gaussian_process/kernels.py:363(diag)
     4000    0.009    0.000    0.009    0.000 {built-in method numpy.core.multiarray.dot}
     2000    0.009    0.000    0.009    0.000 {method 'dot' of 'numpy.ndarray' objects}
     2000    0.003    0.000    0.009    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/fromnumeric.py:1204(diagonal)
    26000    0.008    0.000    0.008    0.000 {built-in method builtins.isinstance}
    28000    0.007    0.000    0.007    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numerictypes.py:951(<listcomp>)
    28000    0.007    0.000    0.007    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/fromnumeric.py:2584(ndim)
    28000    0.007    0.000    0.007    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/spatial/distance.py:124(_copy_array_if_base_present)
     6000    0.007    0.000    0.007    0.000 {method 'reshape' of 'numpy.ndarray' objects}
    26000    0.005    0.000    0.005    0.000 {method 'append' of 'list' objects}
     4000    0.005    0.000    0.005    0.000 {built-in method builtins.getattr}
     4000    0.004    0.000    0.004    0.000 {method 'transpose' of 'numpy.ndarray' objects}
    14000    0.004    0.000    0.004    0.000 {method 'lower' of 'str' objects}
     2000    0.003    0.000    0.003    0.000 {method 'diagonal' of 'numpy.ndarray' objects}
    23000    0.003    0.000    0.003    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/decorators.py:17(typechecking_enabled)
     4000    0.002    0.000    0.003    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/sparse/base.py:1111(isspmatrix)
     1000    0.003    0.000    0.003    0.000 <ipython-input-14-0a4d418383a9>:4(<lambda>)
     4000    0.002    0.000    0.002    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/ma/core.py:6192(isMaskedArray)
     6000    0.002    0.000    0.002    0.000 {method 'get' of 'dict' objects}
     2000    0.002    0.000    0.002    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:1321(<listcomp>)
     2000    0.001    0.000    0.001    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numerictypes.py:1015(<listcomp>)
     2000    0.001    0.000    0.001    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/linalg/blas.py:206(<listcomp>)
     2000    0.001    0.000    0.001    0.000 {method 'index' of 'list' objects}
     2000    0.001    0.000    0.001    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:1327(<listcomp>)
     2000    0.001    0.000    0.001    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:1329(<listcomp>)
     2000    0.001    0.000    0.001    0.000 {built-in method builtins.iter}
     2000    0.001    0.000    0.001    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/scipy/linalg/misc.py:169(_datacopied)
     2000    0.000    0.000    0.000    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numerictypes.py:1016(<listcomp>)
     2000    0.000    0.000    0.000    0.000 /Users/thomas/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:1335(<listcomp>)
        1    0.000    0.000    0.000    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/profile.py:54(__exit__)
        1    0.000    0.000    0.000    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/profile.py:31(__exit__)
        1    0.000    0.000    0.000    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/profile.py:50(__enter__)
        2    0.000    0.000    0.000    0.000 {built-in method time.time}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 /Users/thomas/Documents/Programming/pypuffin/pypuffin/profile.py:46(__init__)



Elapsed time: 0:00:01.378338

Experiments with GPHMC sampling class

Using the class that we have defined, let's redo something like the sampling above.

Actually, this continues in a new notebook: gaussian_process_hmc2

TODO

HMC algorithm - described thoroughly (and well) in Neal's review:

https://arxiv.org/pdf/1206.1901.pdf