In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as pl
We know measured flux depends on telescope pointing, detector temperature, etc., but we don't know how.
Simple linear basis models: over-fitting and/or over-confidence. Much controversy over the years...
For example, in transit spectra, we care only about $R_p/R_*$.
We need to marginalise over the other hyper-parameters to obtain posterior distributions for the parameters of interest.
GPs are readily incorporated into sampling frameworks such as Markov Chain Monte Carlo (MCMC).
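As a concrete illustration, here is a minimal Metropolis sampler for a single GP hyper-parameter (the length-scale of a squared-exponential kernel), using the GP marginal likelihood as the target. The toy data, proposal scale, and flat prior on $\log \ell$ are illustrative choices, not part of the original material; in practice one would use a framework such as emcee.

```python
import numpy as np

def se_kernel(x, ls):
    """Squared-exponential kernel matrix for a 1-D input vector x."""
    return np.exp(-0.5 * (x[:, None] - x[None, :])**2 / ls**2)

def log_likelihood(y, x, ls, sigma=0.1):
    """GP marginal log-likelihood log N(y | 0, K + sigma^2 I)."""
    K = se_kernel(x, ls) + sigma**2 * np.eye(x.size)
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (logdet + y @ np.linalg.solve(K, y)
                   + x.size * np.log(2 * np.pi))

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)  # toy data

# Metropolis sampling of the length-scale (flat prior on log ls):
log_ls = 0.0
logp = log_likelihood(y, x, np.exp(log_ls))
chain = []
for _ in range(500):
    prop = log_ls + 0.2 * rng.standard_normal()     # random-walk proposal
    logp_prop = log_likelihood(y, x, np.exp(prop))
    if np.log(rng.uniform()) < logp_prop - logp:    # accept/reject step
        log_ls, logp = prop, logp_prop
    chain.append(log_ls)
chain = np.exp(np.array(chain))                     # samples of ls itself
```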
Shrinkage (hyper-)priors: a Gamma distribution with shape parameter of unity, i.e. an exponential, $$ \mathrm{p}(\eta) = \frac{1}{l} \exp\left(-\frac{\eta}{l}\right) $$ with a large scale $l$ so the prior is uninformative.
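This prior has a simple log-density, sketched below; the function name and default scale are illustrative, not from the original:

```python
import numpy as np

def log_shrinkage_prior(eta, l=10.0):
    """Log-density of the shrinkage prior p(eta) = (1/l) exp(-eta/l).

    A Gamma distribution with shape parameter 1 reduces to this exponential
    form; a large scale l makes the prior nearly flat over plausible eta.
    """
    if eta < 0:
        return -np.inf  # hyper-parameters are constrained to be non-negative
    return -np.log(l) - eta / l
```

The log-prior is simply added to the GP log-likelihood when sampling or optimizing the hyper-parameters.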
Varying airmass: $\sigma = \sigma_0 + a \exp\left[-(t-t_0)/b \right]$
Clouds: step-change in $\sigma$ at fixed point in time
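Both noise models amount to a time-dependent diagonal in the noise covariance; a small sketch, with all numerical values purely illustrative:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 100)  # illustrative time grid

# Varying airmass: sigma decays smoothly with time.
sigma0, a, b, t0 = 1e-3, 5e-4, 0.3, 0.0
sigma_airmass = sigma0 + a * np.exp(-(t - t0) / b)

# Clouds: step-change in sigma at a fixed time t_step.
t_step = 0.6
sigma_clouds = np.where(t < t_step, sigma0, 3 * sigma0)

# Either model enters the GP likelihood through the diagonal of the noise
# covariance, replacing the constant sigma^2 I term:
noise_cov_diag = sigma_airmass**2
```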
C.f. Tutorial 1 challenge problem.
Used for:
Helps estimate "spottiness" at time of transit observations
Can separate the two components: $$ \begin{array}{rcl} \overline{f}_{t,*} & = & \boldsymbol{k}^{\mathrm{T}}_{t,*} (K + \sigma^2 \mathbb{I})^{-1} \boldsymbol{f} \\ \overline{f}_{p,*} & = & \boldsymbol{k}^{\mathrm{T}}_{p,*} (K + \sigma^2 \mathbb{I})^{-1} \boldsymbol{f} \end{array} $$
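A numpy sketch of this component separation, assuming the total covariance is a sum of two squared-exponential kernels (the kernels, hyper-parameter values, and toy data are illustrative):

```python
import numpy as np

def se_kernel(x1, x2, amp, ls):
    """Squared-exponential kernel matrix between input vectors x1 and x2."""
    d = x1[:, None] - x2[None, :]
    return amp**2 * np.exp(-0.5 * (d / ls)**2)

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)
f = np.sin(t) + 0.1 * rng.standard_normal(t.size)  # toy data
sigma = 0.1

# Total covariance is the sum of the two components' kernels.
K_t = se_kernel(t, t, amp=1.0, ls=2.0)   # e.g. long-term component
K_p = se_kernel(t, t, amp=0.5, ls=0.5)   # e.g. short-term component
K = K_t + K_p

alpha = np.linalg.solve(K + sigma**2 * np.eye(t.size), f)

# Predictive mean of each component at the training times:
mean_t = K_t @ alpha   # k_{t,*}^T (K + sigma^2 I)^{-1} f
mean_p = K_p @ alpha   # k_{p,*}^T (K + sigma^2 I)^{-1} f

# The component means sum to the full GP predictive mean:
mean_full = K @ alpha
```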
Can simply concatenate the two output time-series, and define 2-D input $\mathbf{x}_i = (t_i,l_i)$, where $t$ is time and $l$ is a label which identifies which subset the observation belongs to.
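Building the concatenated inputs is mechanical; a minimal sketch with two toy time-series (the data are illustrative):

```python
import numpy as np

# Two time-series (e.g. two bandpasses) observed at possibly different times.
t1 = np.linspace(0, 1, 5)
t2 = np.linspace(0, 1, 7)
y1 = np.sin(t1)
y2 = 0.8 * np.sin(t2)

# Concatenate the outputs, and build 2-D inputs x_i = (t_i, l_i), where the
# label l = 0 or 1 identifies which subset each observation belongs to.
y = np.concatenate([y1, y2])
X = np.column_stack([
    np.concatenate([t1, t2]),
    np.concatenate([np.zeros_like(t1), np.ones_like(t2)]),
])
```

The covariance function can then treat the time and label dimensions differently, e.g. sharing a time kernel but scaling its amplitude per label.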
The RV curve of a spot looks like the time-derivative of its light curve (Aigrain et al. 2012)
Rajpaul et al. (2015): Extend the concept to other (spectral) indicators of activity
Covariance between observations of $G$ and $\dot{G}$: $$ \begin{array}{rcl} \gamma^{(G,dG)}(t_i,t_j) & = & \frac{\partial}{\partial t'} \left. \gamma^{(G,G)}(t_i,t') \right|_{(t'=t_j)}\\ \gamma^{(dG,dG)}(t_i,t_j) & = & \frac{\partial}{\partial t'} \, \frac{\partial}{\partial t} \left. \left. \gamma^{(G,G)}(t,t') \right|_{(t=t_i)} \right|_{(t'=t_j)} \end{array} $$ (Similar expressions exist for integral observations)
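For a squared-exponential kernel these derivatives have closed forms; a sketch (the unit length-scale is an illustrative choice), with the cross term taken as the derivative with respect to the second argument:

```python
import numpy as np

def gamma_GG(t, tp, ls=1.0):
    """SE covariance between function values G(t) and G(t')."""
    return np.exp(-0.5 * (t - tp)**2 / ls**2)

def gamma_GdG(t, tp, ls=1.0):
    """Cross-covariance between G(t) and dG/dt' at t': d/dt' of gamma_GG."""
    return (t - tp) / ls**2 * gamma_GG(t, tp, ls)

def gamma_dGdG(t, tp, ls=1.0):
    """Covariance between dG/dt at t and dG/dt' at t': d^2/(dt dt') of gamma_GG."""
    r = t - tp
    return (1.0 / ls**2 - r**2 / ls**4) * gamma_GG(t, tp, ls)
```

The analytic forms can be checked against finite differences of `gamma_GG`, a useful sanity test when deriving these expressions for a new kernel.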
To get around this:
Also illustrates using logarithmic time $\tau = \log(t + h)$ as an input.
If inputs are regularly sampled and kernel depends on input distances only, then $K$ is Toeplitz (constant along each diagonal).
Toeplitz matrices can be inverted, and their determinant evaluated in $\mathcal{O}(N \log N)$ operations.
However, real astronomical datasets rarely have strictly regular sampling (missing data).
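A sketch of a Toeplitz solve, using `scipy.linalg.solve_toeplitz` (which uses Levinson recursion, $\mathcal{O}(N^2)$; FFT-based methods reach $\mathcal{O}(N \log N)$). The kernel values and toy data are illustrative:

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Regularly sampled inputs with a stationary kernel give a Toeplitz K,
# fully specified by its first column (the kernel at lags 0..N-1):
t = np.arange(64)
first_col = np.exp(-0.5 * (t / 5.0)**2)  # SE kernel values at each lag
first_col[0] += 1e-3                     # white-noise term on the diagonal

y = np.sin(t / 10.0)                     # toy data

# Solve (K + sigma^2 I) alpha = y without ever forming the dense matrix:
alpha = solve_toeplitz(first_col, y)

# Agrees with the dense solve:
K = toeplitz(first_col)
assert np.allclose(K @ alpha, y)
```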
$\mathcal{O}(N^3) \rightarrow \mathcal{O}(NM^2)$, where the user chooses the number of inducing points $M \le N$.
Low-rank approximation of $K$ $$ K_{NN} \approx Q_{NN} = K_{NM} K^{-1}_{MM} K_{MN} $$
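A numpy sketch of this Nyström-style low-rank approximation, using an SE kernel and evenly spaced inducing inputs (all values illustrative; the jitter term is for numerical stability):

```python
import numpy as np

def se_kernel(x1, x2, ls=1.0):
    """Squared-exponential kernel matrix between input vectors x1 and x2."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :])**2 / ls**2)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 200))      # N = 200 inputs
z = np.linspace(0, 10, 20)                # M = 20 inducing inputs

K_NM = se_kernel(x, z)
K_MM = se_kernel(z, z) + 1e-8 * np.eye(z.size)  # jitter for stability

# Low-rank approximation Q_NN = K_NM K_MM^{-1} K_MN:
Q_NN = K_NM @ np.linalg.solve(K_MM, K_NM.T)

K_NN = se_kernel(x, x)
rel_err = np.linalg.norm(K_NN - Q_NN) / np.linalg.norm(K_NN)
```

When the inducing inputs cover the data well relative to the length-scale, the relative error of $Q_{NN}$ is small.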
For a full description see e.g. these slides by James Hensman.
where: $$ \mathrm{p}(\boldsymbol{y}\,|\,\boldsymbol{f}) = \mathcal{N}(\boldsymbol{y}\,|\,\boldsymbol{f},\sigma^2\mathbb{I}), $$
Then use a variational approach to lower-bound $\mathrm{p}(\boldsymbol{y} \,|\, \boldsymbol{u})$, and marginalise $\boldsymbol{u}$, giving the approximate likelihood:
$\tilde{\mathrm{p}}(\boldsymbol{y}) = \mathcal{N}(\boldsymbol{y}\,|\,\boldsymbol{0},K_{NM} \, K_{MM}^{-1} \, K_{MN} + \sigma^2\mathbb{I}) \, \exp \left\{ - \frac{1}{2 \sigma^2} \sum_{i=1}^{N} \left( K_{ii} - \boldsymbol{k}_{Mi}^\mathrm{T} \, K_{MM}^{-1} \boldsymbol{k}_{Mi}\right) \right\}$.
Never need to invert $K_{NN}$.
Depends only on the covariance function parameters $\boldsymbol{\theta}$, and the inducing points $Z$: jointly optimize $\boldsymbol{\theta}$ and $Z$.
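A numpy sketch of this approximate likelihood (the variational lower bound of Titsias 2009). For clarity it forms the $N \times N$ matrix explicitly; real implementations use the matrix-inversion lemma so that no $N \times N$ inverse is ever needed. The kernel, data, and jitter values are illustrative:

```python
import numpy as np

def se_kernel(x1, x2, ls=1.0):
    """Squared-exponential kernel matrix between input vectors x1 and x2."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :])**2 / ls**2)

def titsias_log_bound(x, y, z, sigma=0.1, ls=1.0):
    """Variational lower bound on log p(y) for a sparse GP:

    log N(y | 0, Q_NN + sigma^2 I) - (1 / 2 sigma^2) Tr(K_NN - Q_NN),
    with Q_NN = K_NM K_MM^{-1} K_MN and z the inducing inputs.
    """
    N = x.size
    K_NM = se_kernel(x, z, ls)
    K_MM = se_kernel(z, z, ls) + 1e-8 * np.eye(z.size)  # jitter
    Q_NN = K_NM @ np.linalg.solve(K_MM, K_NM.T)

    C = Q_NN + sigma**2 * np.eye(N)
    _, logdet = np.linalg.slogdet(C)
    quad = y @ np.linalg.solve(C, y)
    log_gauss = -0.5 * (N * np.log(2 * np.pi) + logdet + quad)

    # Trace correction penalising the low-rank approximation error:
    trace_term = np.trace(se_kernel(x, x, ls) - Q_NN)
    return log_gauss - 0.5 * trace_term / sigma**2
```

The bound never exceeds the exact marginal log-likelihood, and tightens as the inducing inputs $Z$ improve, which is why $\boldsymbol{\theta}$ and $Z$ can be optimized jointly.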
This form of sparse GP is implemented in the GPy package.
Write $K = K_{k} \, K_{k-1} \, K_{k-2} \, \ldots \, K_1 \, K_0$, where $K_{k} \in \mathbb{R}^{N \times N}$ is a block-diagonal matrix with $2^k$ diagonal blocks, each of size $N/2^k \times N/2^k$.
This allows evaluating $K^{-1}$ and $|K|$ in $\mathcal{O}(N \log^2 N)$, regardless of the number of input dimensions.
Includes sparse GPs using variational approx.
High-level modelling and plotting tools.
Both come with a basic modelling framework (less sophisticated than GPy's).
Easy to combine with MCMC using emcee, or nested sampling using PyMultiNest.