One of the most attractive characteristics of the Bayesian Optimization framework is the number of choices one has when designing the optimizer: among others, the surrogate model, the covariance function, the acquisition function, and the way covariance hyperparameters are handled. All of these are covered to a large extent in pyGPGO, making it, to the best of my knowledge, the most comprehensive open-source BO package.
In this tutorial, we will cover two of the possibilities offered for treating covariance function hyperparameters: the marginal likelihood approach and the fully Bayesian approach.
We pick Franke's function as our optimization target:
In [1]:
import numpy as np
def f(x, y):
    # Franke's function (https://www.mathworks.com/help/curvefit/franke.html)
    one = 0.75 * np.exp(-(9 * x - 2)**2 / 4 - (9 * y - 2)**2 / 4)
    two = 0.75 * np.exp(-(9 * x + 1)**2 / 49 - (9 * y + 1) / 10)
    three = 0.5 * np.exp(-(9 * x - 7)**2 / 4 - (9 * y - 3)**2 / 4)
    four = 0.25 * np.exp(-(9 * x - 4)**2 - (9 * y - 7)**2)
    return one + two + three - four
A simple visualization of this function:
In [2]:
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm
x = np.linspace(0, 1, num=1000)
y = np.linspace(0, 1, num=1000)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') was removed in matplotlib >= 3.6
surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm, linewidth=0)
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
We use the Matérn $\nu=3/2$ kernel as our covariance function, a Gaussian Process surrogate, and the Expected Improvement acquisition function. Notice that we call the GaussianProcess class with the arguments optimize=True and usegrads=True, which specify that we want to optimize the marginal log-likelihood using exact gradients (usegrads=False would approximate them numerically).
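For reference, the objective being maximized here is the log marginal likelihood of the Gaussian Process. Writing $K_\theta$ for the covariance matrix over the $n$ observed inputs under hyperparameters $\theta$ (the notation in this aside is ours, not pyGPGO's), it takes the standard form

$$\log p(\mathbf{y} \mid X, \theta) = -\tfrac{1}{2}\,\mathbf{y}^\top K_\theta^{-1}\mathbf{y} - \tfrac{1}{2}\log\lvert K_\theta \rvert - \tfrac{n}{2}\log 2\pi,$$

with the well-known gradient

$$\frac{\partial}{\partial \theta_j}\log p(\mathbf{y} \mid X, \theta) = \tfrac{1}{2}\,\operatorname{tr}\!\left[\left(\boldsymbol{\alpha}\boldsymbol{\alpha}^\top - K_\theta^{-1}\right)\frac{\partial K_\theta}{\partial \theta_j}\right], \qquad \boldsymbol{\alpha} = K_\theta^{-1}\mathbf{y},$$

which is the kind of exact gradient that usegrads=True refers to. Purely as an illustration (plain NumPy, not pyGPGO's code path, with helper names of our own choosing), the quantity above could be evaluated for a Matérn $\nu=3/2$ kernel with a single length-scale as follows:

import numpy as np
def matern32_kernel(A, B, lengthscale=1.0):
    # Matérn nu=3/2 covariance between two sets of points (one point per row)
    d = np.sqrt(((A[:, None, :] - B[None, :, :])**2).sum(-1))
    r = np.sqrt(3.0) * d / lengthscale
    return (1.0 + r) * np.exp(-r)
def log_marginal_likelihood(X, y, lengthscale=1.0, noise=1e-6):
    # log p(y | X, theta) for a zero-mean GP, via a Cholesky factor of K
    n = len(y)
    K = matern32_kernel(X, X, lengthscale) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

Maximizing this quantity over the length-scale (and any other hyperparameters) is what optimize=True asks the surrogate to do.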
In [3]:
from pyGPGO.covfunc import matern32
from pyGPGO.acquisition import Acquisition
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.GPGO import GPGO
cov = matern32()
gp = GaussianProcess(cov, optimize=True, usegrads=True)
acq = Acquisition(mode='ExpectedImprovement')
param = {'x': ('cont', [0, 1]), 'y': ('cont', [0, 1])}
We start the optimization process as usual and get our results back:
In [4]:
np.random.seed(20)
gpgo = GPGO(gp, acq, f, param)
gpgo.run(max_iter=10)
In [5]:
gpgo.getResult()
Out[5]:
Instead of optimizing the marginal log-likelihood, a Bayesian would take the uncertainty of the hyperparameters into account by integrating them out. The process here is very similar, but in this case we replace our GaussianProcess class with one that implements MCMC sampling (via pyMC3). In this example we use slice sampling for 300 iterations with a burn-in of 100 samples.
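Concretely, the integrated version of the acquisition function averages the usual Expected Improvement over the posterior of the hyperparameters. With $M$ posterior samples $\theta^{(1)}, \dots, \theta^{(M)}$ obtained by MCMC, it is approximated as

$$\alpha_{\mathrm{int}}(x) = \int \alpha_{\mathrm{EI}}(x \mid \theta)\, p(\theta \mid \mathcal{D})\, d\theta \approx \frac{1}{M}\sum_{m=1}^{M} \alpha_{\mathrm{EI}}\!\left(x \mid \theta^{(m)}\right),$$

which is why we switch to the 'IntegratedExpectedImprovement' acquisition mode below. (The notation is ours and sketches the idea rather than pyGPGO's exact implementation.)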
In [6]:
from pyGPGO.surrogates.GaussianProcessMCMC import GaussianProcessMCMC
import pymc3 as pm
gp = GaussianProcessMCMC(cov, niter=300, burnin=100, step=pm.Slice)
acq = Acquisition(mode='IntegratedExpectedImprovement')
The procedure now is exactly the same as before:
In [7]:
np.random.seed(20)
gpgo = GPGO(gp, acq, f, param)
gpgo.run(max_iter=10)
We can check the posterior distribution of the hyperparameters $\theta$:
In [8]:
gpgo.GP.posteriorPlot()
And finally get our results back:
In [9]:
gpgo.getResult()
Out[9]: