Introduction to the Gaussian process regression

Solusion to the Practice ploblem

Georgios Karagiannis, Department of Mathematics, Purdue

SURF 2016

July 8, 2016

Objective :

We wish to build a Gaussian Process regression --a probabilistic surrogate model-- in order to be able to emulate the output of the Piston Simulator, with respect to specified inputs.

Related material:

Readings :

- Rasmussen, Carl Edward. "Gaussian processes in machine learning." In Advanced lectures on machine learning, pp. 63-71. Springer Berlin Heidelberg, 2004.  
    - see http://www.GaussianProcess.org/gpml
    - Chapters: 2, 4, 5.1, & 5.4.2

- Slides provided

Software :

- R-cran (https://cran.r-project.org/)
- R packages 
    - DiceKrigin (https://cran.r-project.org/web/packages/DiceKriging/index.html)
    - lhs (https://cran.r-project.org/web/packages/lhs/index.html)
- Roustant, Olivier, David Ginsbourger, and Yves Deville. "DiceKriging, DiceOptim: Two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization." (2012).
    - Plus dependences ...

Catalytic Conversion of Nitrate to Nitrogen model

Example: Catalytic Conversion of Nitrate to Nitrogen

This is Example 3.1 of [Tsilifis, Panagiotis, Ilias Bilionis, Ioannis Katsounaros, and Nicholas Zabaras. "Variational Reformulation of Bayesian Inverse Problems." arXiv preprint arXiv:1410.5522 (2014)].

Consider the catalytic conversion of nitrate ($\mbox{NO}_3^-$) to nitrogen ($\mbox{N}_2$) and other by-products by electrochemical means. The mechanism that is followed is complex and not well understood. The experiment of \cite{katsounaros} confirmed the production of nitrogen ($\mbox{N}_2$), ammonia ($\mbox{NH}_3$), and nitrous oxide ($\mbox{N}_2\mbox{O}$) as final products of the reaction, as well as the intermediate production of nitrite ($\mbox{NO}_2^-$). The time is measured in minutes and the conentrations are measured in $\mbox{mmol}\cdot\mbox{L}^{-1}$. Let's load the data into this notebook using the Pandas Python module:

This inconsistency suggests the existence of an intermediate unobserved reaction product X. [Katsounaros, Ioannis, Maria Dortsiou, Christos Polatides, Simon Preston, Theodore Kypraios, and Georgios Kyriacou. "Reaction pathways in the electrochemical reduction of nitrate on tin." Electrochimica Acta 71 (2012): 270-276.] suggested that the following reaction path shown in the following figure.

The dynamical system associated with the reaction is: $$ \begin{array}{cc} \frac{d \left[\mbox{NO}_3^-\right]}{dt} &= -k_1\left[\mbox{NO}_3^-\right], \\ \frac{d\left[\mbox{NO}_2^-\right]}{dt} &= k_1\left[\mbox{NO}_3^-\right] - (k_2 + k_4 + k_5)[\mbox{NO}_2^-], \\ \frac{d \left[\mbox{X}\right]}{dt} &= k_2 \left[\mbox{NO}_2^-\right] - k_3 [X],\\ \frac{d \left[\mbox{N}_2\right]}{dt} &= k_3 \left[\mbox{X}\right], \\ \frac{d \left[\mbox{NH}_3\right]}{dt} &= k_4 \left[\mbox{NO}_2^-\right],\\ \frac{d \left[\mbox{N}_2O\right]}{dt} &= k_5 \left[\mbox{NO}_2^-\right], \end{array} $$ where $[\cdot]$ denotes the concentration of a quantity, and $k_i > 0$, $i=1,...5$ are the kinetic rate constants.

Computational Model

We will develop a generic computational model for the solution of dynamical systems and we will use it to study the catalysis problem. The code relies on the Fourth-order Runge-Kutta method and is a modified copy of http://www.math-cs.gordon.edu/courses/ma342/python/diffeq.py developed by Jonathan Senning. The code solves:

$$ \begin{array}{ccc} \dot{\mathbf{y}} &=& f(\mathbf{y}, t),\\ \mathbf{y}(0) &=& \mathbf{y}_0. \end{array} $$

The input values are:

Variable	Value
$\xi_1$	$1.35\pm 0.05$
$\xi_2$	$1.65\pm 0.08$
$\xi_3$	$1.34\pm 0.11$
$\xi_4$	$-0.16\pm 0.16$
$\xi_5$	$-3.84\pm 0.20$

The output values of the simulator are the concentrations (in $\mbox{mmol}\cdot\mbox{L}^{-1}$) of $\mbox{NO}_3^-$, $\mbox{NO}_2^-$, X ( unobserved reaction product), $\mbox{N}_2$, $\mbox{NH}_3$, $\mbox{N}_2\mbox{O}$, and $\mbox{NO}_2^-$.

The R code in './catalytic.R' provides a simulator that returns only one output value (selected by the user), given the values of the 5 inputs.

Tsilifis, Panagiotis, Ilias Bilionis, Ioannis Katsounaros, and Nicholas Zabaras. "Variational Reformulation of Bayesian Inverse Problems." arXiv preprint arXiv:1410.5522 (2014)

Software preparation



In [1]:

    
# DOWNLOAD THE R PACKAGES REQUIRED
install.packages('DiceKriging', repos = "http://cran.us.r-project.org")
install.packages('lhs', repos = "http://cran.us.r-project.org")
# install.packages('tcltk', repos = "http://cran.us.r-project.org")
# install.packages('aplpack', repos = "http://cran.us.r-project.org")









    



Installing package into ‘/export/users/gkaragia/R/x86_64-redhat-linux-gnu-library/3.3’
(as ‘lib’ is unspecified)
Warning message:
In install.packages("DiceKriging", repos = "http://cran.us.r-project.org"): installation of package ‘DiceKriging’ had non-zero exit statusInstalling package into ‘/export/users/gkaragia/R/x86_64-redhat-linux-gnu-library/3.3’
(as ‘lib’ is unspecified)
Warning message:
In install.packages("lhs", repos = "http://cran.us.r-project.org"): installation of package ‘lhs’ had non-zero exit status



In [2]:

    
# LOAD THE R PACKAGES REQUIRED
library('lhs')
library('DiceKriging')
# library('tcltk')
# library('aplpack')



In [3]:

    
# THIS IS THE SIMULATOR AND THE MIN AND MAX OF THE INPUTS

source("./catalytic.R") # function: output_1d <- simulator(input_5d, jout=4)
input_min <- c(1.30,  1.57,  1.23, -0.32, -4.04)
input_max <- c(1.40,  1.73,  1.45,  0.00, -3.64)
input_d <- length(input_min)

# par(mfrow=c(2,3))
# for (i in 1:6) {
#     n_data <- 500 ;
#     n_dim <- input_d
#     X_data <- t(input_min + (input_max-input_min)*t(matrix(runif(n_data*n_dim),n_data, n_dim))) ;
#     myfun <-function(xi){ return(simulator(xi,i))}
#     Y_data <- apply(X_data, 1, myfun) ;
#     hist(Y_data)
# }

myfun <- function(xx) {return(simulator(xx, jout=4)) }

# # PLOT THE REAL FUNCTION TO SEE HOW IT LOOKS LIKE

# par(mfrow = c(3,3))
# for (i in 1:input_d) {
#     for ( j in 1:input_d ) 
#         if(i>j) {
#         n.grid <- 100 ;
#         x1.grid <-seq(input_min[i],input_max[i],length.out=n.grid) ;
#         x2.grid <-seq(input_min[j],input_max[j],length.out=n.grid) ;
#         X.grid <- expand.grid( x1=x1.grid,  x2=x2.grid )
#         myfun2d<-function(xx){ 
#                             zz<-0.5*(input_min+input_max) ; 
#                             zz[i]<-xx[1]; zz[j]<-xx[2]; 
#                             return(myfun(zz)) 
#                     }
#         y.grid <- apply(X.grid,1,myfun2d)
#     contour(x1.grid, x2.grid, matrix(y.grid, n.grid, n.grid), 10, 
#             main = "Real function", 
#             xlab = paste("x", as.character(i)), 
#             ylab = paste("x", as.character(j)),
#             xlim = c(input_min[i],input_max[i]), 
#             ylim = c(input_min[j],input_max[j]))
#     }
# }

Generate a training data-set

Generate a training data-set $D={(x_i,y_i);i=1,...,n}$ of size $n=20$ via a LHS.



In [ ]:

Compare different GP regression models

Use the diagnostics we leant, and compare Gaussian process regression models deffering on the prior linear trend structure or the covariance function



In [ ]:



In [ ]:

Choose the 'Best' model



In [ ]:

Check the validity of the chossen 'Best' model



In [ ]:

Print the estimates of the hyperparameters



In [5]:

    
# PRINT THE PARAMETERS

show(myfun_km)









    



Error in show(myfun_km): object 'myfun_km' not found

Plot the posterior GPR mean and variance, as well as the real function (in contour plots)



In [ ]: