In [1]:
# Initial set-up: load the notebook's plotting style and CSS
import json
import urllib2
import matplotlib
s = json.load(urllib2.urlopen("https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/raw/master/styles/bmh_matplotlibrc.json"))
matplotlib.rcParams.update(s)
%pylab inline
figsize(11, 9)
from IPython.display import HTML
def css_styling():
    styles = "https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/raw/master/styles/custom.css"
    return HTML(styles)
css_styling()
Out[1]:
This notebook presents an example of regularisation/data assimilation that carries on from the previous one, where we showed how DA techniques result in an optimal estimate of the state of a time-varying magnitude, interpolating over data gaps and so on. In this case, rather than applying these ideas to, e.g., an NDVI time series, we will use a radiative transfer (RT) model to couple the observations (directional surface reflectance acquired over a number of bands) to the state (defined in this case in terms of LAI, chlorophyll content, etc.). This is a much more useful example, as we are effectively mapping from reflectance to quantities of interest, taking into account the radiative transfer physics as well as the uncertainties in the observations.
Assume that we have a set of directional reflectance observations $R=\{\rho(\Omega,\Omega', \lambda_i)\}$. We will interpret them with an RT model (in this case PROSAIL, but others are available). PROSAIL predicts surface reflectance for a given acquisition geometry $(\Omega,\Omega')$ from 400 to 2500 $nm$ as a function of a number of parameters: leaf structure ($n$), chlorophyll ($C_{ab}$), carotenoid ($C_{car}$), senescent matter ($C_{brown}$), leaf water ($C_{w}$) and dry matter ($C_{m}$) contents, leaf area index (LAI), average leaf angle (ALA), and the soil brightness and moisture terms ($b_{soil}$, $p_{soil}$).
Some of these parameters are to do with the structure and bottom boundary, and some are to do with the spectral properties of the leaves. Mathematically, we can write what PROSAIL simulates as $$ \rho(\Omega,\Omega', \lambda) = M(\bf{x}, \Omega, \Omega'), $$ where $\bf{x}$ is a vector that stacks the values of the PROSAIL parameters for a geometric configuration $\Omega, \Omega'$. We will be using the PROSAIL python bindings to simulate the reflectance. For example:
In [2]:
import prosail
wv = np.arange(400, 2501)
# Simulate top-of-canopy reflectance for a range of LAI values
for lai in np.arange(0, 4, .5):
    brf = prosail.run_prosail(1.5, 60., 15., 0.1,
        0.016, 0.065, lai, 45., 0, 1., 0.1, 0.001, 30., 0, 0, 2)
    plt.plot(wv, brf, label="%2.1f" % lai)
plt.legend(loc='best', fancybox=True)
plt.xlabel("Wavelength [nm]")
plt.ylabel(r'$\rho$')
Out[2]:
The previous plot shows the effect of varying LAI for a given canopy between 0 and 3.5 in steps of 0.5 LAI units, for a given observational geometry. Typically, remote sensing sensors acquire data over discrete spectral bands, so rather than get a whole spectrum, we get discrete estimates of reflectance over parts of the spectrum. For example, MODIS uses 7 bands. If we assume that the bandpass function is constant between the minimum and maximum wavelengths, then we can calculate some useful arrays to simulate MODIS data:
In [3]:
# MODIS band edges (bands 1-7), in nm
b_min = np.array([620., 841, 459, 545, 1230, 1628, 2105])
b_max = np.array([670., 876, 479, 565, 1250, 1652, 2155])
band_pass = np.zeros((7, 2101), dtype=np.bool)
n_bands = b_min.shape[0]
bw = np.zeros(n_bands)  # band widths
bh = np.zeros(n_bands)  # band centres
for i in xrange(n_bands):
    band_pass[i, :] = np.logical_and(wv >= b_min[i], wv <= b_max[i])
    bw[i] = b_max[i] - b_min[i]
    bh[i] = (b_max[i] + b_min[i]) / 2.
    plt.axvspan(b_min[i], b_max[i], color='0.8')
plt.plot(wv, brf)
plt.xlabel("Wavelength [nm]")
plt.ylabel("Reflectance")
Out[3]:
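To see what PROSAIL output looks like through these bands, we can average the simulated spectrum over each band-pass (a minimal sketch, assuming the flat band-pass defined above; rho_modis is just an illustrative name):
# Band-averaged reflectance for the last simulated spectrum
rho_modis = np.array([brf[band_pass[i]].mean() for i in xrange(n_bands)])
plt.plot(bh, rho_modis, 'o')
plt.xlabel("Band centre [nm]")
plt.ylabel("Reflectance")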
The challenge here is to retrieve the parameter vector $\bf{x}$ that gives rise to the observations that we have, $\rho(\Omega, \Omega', \lambda_i)$. Since our model $M(\bf{x})$ is non-linear, there is no simple way to obtain that vector directly. The problem is complicated further by observational noise and by the reduced sensitivity of the observations to some of the parameters, which make the inversion ill-posed.
In a least squares sense, we can write that the best solution is the one that minimises the distance between model and observations, modulated by the uncertainty in the observations, i.e.
$$ \bf{x}_{opt} = \arg\min_{\bf{x}}\left\|\rho(\Omega, \Omega', \lambda) - M(\bf{x}, \Omega, \Omega')\right\|^{2}_{\bf{C}_{obs}^{-1}} $$
The main assumption here is that the observational noise is i.i.d. Gaussian. In the presence of noise and reduced sensitivity to the observations, families of parameters can provide an acceptable solution (i.e., different values of $\bf{x}$ provide predictions of the observations that fall within the observational error bars), so we typically take a Bayesian approach to this problem: we assume that $\bf{x}$ has a p.d.f., and note that we want to find the p.d.f. of $\bf{x}$ conditioned on the observations $R$, $p(\bf{x}|R)$. However, we only have access to the forward model, which calculates the reflectances given the model parameters (and assumptions about noise), or in other words, $p(R | \bf{x})$. We can use Bayes' Rule to calculate the pdf of interest if we add to this last term a prior term, $p(\bf{x})$, so we have
$$ p(\bf{x}|R) \propto p(R | \bf{x})\cdot p(\bf{x}) $$
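Under the i.i.d. Gaussian noise assumption, the likelihood term takes the explicit form (writing $\rho$ for the stacked vector of observed reflectances and $\bf{C}_{obs}$ for the observational covariance):
$$ p(R|\bf{x}) \propto \exp\left[-\frac{1}{2}\left(\rho - M(\bf{x})\right)^{\top}\bf{C}_{obs}^{-1}\left(\rho - M(\bf{x})\right)\right] $$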
The prior term is the pdf of the elements of $\bf{x}$ before we have any data. So one fairly simplistic approach is to say that we expect the elements of $\bf{x}$ to be normally distributed, with a given mean and covariance structure. However, if $\bf{x}$ relates to a time-varying magnitude (for example, the temporal evolution of LAI), we might also want to add a prior constraint that encodes a belief of smooth transition, or in other words, that a priori, we expect that the value of LAI today will be the same as tomorrow (but with an uncertainty on this belief). We can then assume that the two parts of the prior are independent and encode them as a product. The first part is that we penalise solutions that deviate from a prior distribution characterised by $\bf{x}_{prior}$ and $\bf{C}_{prior}$: $$ p(\bf{x})\propto \exp\left[-\frac{1}{2}(\bf{x}-\bf{x}_{prior})^{\top}\bf{C}_{prior}^{-1} (\bf{x}-\bf{x}_{prior}) \right] $$
The second term can be encoded as the first-order differences following a 0-mean normal distribution with some variance $\sigma_{smooth}^{2}$. In a Kalman filter sense, we have that the model we use to update the state is just the identity, with a model error given by $\sigma_{smooth}^{2}$:
$$ p(x)\propto \exp\left[-\frac{1}{2}\frac{(x_{t}-x_{t-1})^{2}}{\sigma_{smooth}^{2}} \right] $$
Plugging all these functions into Bayes' Rule, and taking negative logs, we end up with the following function $J(\bf{x})$:
$$ \begin{align} J(\bf{x}) =& \frac{1}{2}(\bf{x}-\bf{x}_{prior})^{\top}\bf{C}_{prior}^{-1} (\bf{x}-\bf{x}_{prior})\\ +& \frac{1}{2}\sum_{t}\frac{(x_{t}-x_{t-1})^{2}}{\sigma_{smooth}^{2}}\\ +& \frac{1}{2}(\rho-M(\bf{x}))^{\top}\bf{C}_{obs}^{-1} (\rho-M(\bf{x}))\\ \end{align} $$
The minimum of this function is the maximum of the posterior distribution $p(\bf{x}|R)$, i.e. the most likely solution that (i) fits the data (using our RT model to map from state to reflectance) and (ii) takes into account the prior information, both on the state values (and their correlations) and on the dynamic evolution. Note that this latter term could also be provided by some expectation of the state from a vegetation model, at least for the parts of the model that the vegetation model has in common with our state (for example, LAI), and taking into account that they must be defined coherently.
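The smoothness term is conveniently written in matrix form as $\frac{1}{2\sigma_{smooth}^{2}}\bf{x}^{\top}\bf{D}^{\top}\bf{D}\bf{x}$, where $\bf{D}$ is the first-order difference operator. Here is a minimal numpy sketch of this penalty (the names D and smoothness_cost are illustrative, not part of eoldas_ng):
import numpy as np

def smoothness_cost(x, sigma_smooth):
    """Quadratic first-order difference penalty on a time series x."""
    N = x.size
    # First-order difference matrix, shape (N-1) x N
    D = np.diag(-np.ones(N)) + np.diag(np.ones(N - 1), k=1)
    D = D[:-1, :]
    dx = D.dot(x)  # dx[t] = x[t+1] - x[t]
    return 0.5 * dx.dot(dx) / sigma_smooth**2

# A smooth series is penalised far less than a noisy one
t = np.linspace(0, 1, 100)
print smoothness_cost(np.sin(np.pi * t), 0.1)    # small
print smoothness_cost(np.random.rand(100), 0.1)  # large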
eoldas_ng provides a way to solve the minimisation problem easily. There are two main challenges: first, the RT model is computationally expensive, and the minimisation requires very many forward model runs; second, a gradient-based minimisation needs the partial derivatives of the model, which are not directly available. We solve both of these problems by using Gaussian Process emulators (GPs). These emulators use a number of RT model outputs to generate a non-linear statistical model that simulates what the RT model would produce (with an uncertainty estimate). These functions are cheap to evaluate and have simple forms for their partial derivatives that we can exploit too.
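To make the emulation idea concrete, here is a minimal, self-contained GP regression sketch in one dimension (an illustration of the principle only; eoldas_ng uses the MultivariateEmulator class, not this code):
import numpy as np

def rbf_kernel(x1, x2, length=0.5, amp=1.0):
    """Squared exponential covariance between two sets of points."""
    d = x1[:, None] - x2[None, :]
    return amp * np.exp(-0.5 * (d / length)**2)

# "Expensive model" outputs at a handful of training points
x_train = np.linspace(0, 4, 8)
y_train = np.sin(x_train) * np.exp(-0.2 * x_train)

# GP posterior mean and variance at new points
x_new = np.linspace(0, 4, 200)
K = rbf_kernel(x_train, x_train) + 1e-8 * np.eye(x_train.size)
Ks = rbf_kernel(x_new, x_train)
alpha = np.linalg.solve(K, y_train)
y_mean = Ks.dot(alpha)  # emulator prediction
v = np.linalg.solve(K, Ks.T)
y_var = rbf_kernel(x_new, x_new).diagonal() - np.sum(Ks * v.T, axis=1)
Because the emulator mean is a weighted sum of kernel functions, its partial derivatives with respect to the inputs are available in closed form, which is what makes gradient-based minimisation cheap.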
In this example, we will be using a time series of synthetic MODIS-like data that has been generated with some particular parameter temporal trajectories. We will then use these observations and the DA system to solve for the true state.
In particular, we will first have a system where MODIS observations are produced every 7 days, initially always with the same acquisition geometry. Data gaps will be simulated and typical noise added. All other parameters will be kept fixed to their default values. In eoldas_ng, we call time-varying parameters VARIABLE.
We will use the PROSAIL model, but also create GP emulators of the model for convenience.
Let's set up the basics. We need a number of OrderedDict objects to configure the variables of the state that we will use. Only LAI, chlorophyll ($C_{ab}$) and leaf water ($C_{w}$) will change as a function of time, whereas all the other variables will be FIXED to their default values. We need to define these defaults, as well as parameter boundaries. Finally, we will also define transformations of some parameters, which are useful in that they quasi-linearise the model. All these terms are defined as OrderedDict objects, and are then passed to the State class that will bind them together:
In [5]:
from collections import OrderedDict
try:
    from operators import *
    from state import *
except ImportError:
    import sys
    sys.path.append("../eoldas_ng/")
    from operators import *
    from state import *
# Which parameters vary in time (VARIABLE) and which are FIXED?
state_config = OrderedDict()
state_config['n'] = FIXED
state_config['cab'] = VARIABLE
state_config['car'] = FIXED
state_config['cbrown'] = FIXED
state_config['cw'] = VARIABLE
state_config['cm'] = FIXED
state_config['lai'] = VARIABLE
state_config['ala'] = FIXED
state_config['bsoil'] = FIXED
state_config['psoil'] = FIXED
# Now define the default values
default_par = OrderedDict()
default_par['n'] = 1.5
default_par['cab'] = 40.
default_par['car'] = 10.
default_par['cbrown'] = 0.01
default_par['cw'] = 0.018 # Say?
default_par['cm'] = 0.0065 # Say?
default_par['lai'] = 2.
default_par['ala'] = 45.
default_par['bsoil'] = 1.
default_par['psoil'] = 0.1
# Parameter boundaries (one value per parameter, in state_config order)
parameter_min = OrderedDict()
parameter_max = OrderedDict()
min_vals = [0.8, 0.2, 0.0, 0.0, 0.0043, 0.0017, 0.01, 40., 0., 0.]
max_vals = [2.5, 77., 5., 1., 0.0753, 0.0331, 8., 50., 2., 1.]
for i, param in enumerate(state_config.keys()):
    parameter_min[param] = min_vals[i]
    parameter_max[param] = max_vals[i]
# Define parameter transformations (quasi-linearise the model)
transformations = {
    'lai': lambda x: np.exp(-x / 2.),
    'cab': lambda x: np.exp(-x / 100.),
    'car': lambda x: np.exp(-x / 100.),
    'cw': lambda x: np.exp(-50. * x),
    'cm': lambda x: np.exp(-100. * x),
    'ala': lambda x: x / 90.}
inv_transformations = {
    'lai': lambda x: -2 * np.log(x),
    'cab': lambda x: -100 * np.log(x),
    'car': lambda x: -100 * np.log(x),
    'cw': lambda x: (-1 / 50.) * np.log(x),
    'cm': lambda x: (-1 / 100.) * np.log(x),
    'ala': lambda x: 90. * x}
# Define the state grid. In time in this case
state_grid = np.arange(1, 366)
# Define the state
# L'etat, c'est moi
state = State(state_config, state_grid, default_par,
    parameter_min, parameter_max, verbose=False)
# Set the transformations
state.set_transformations(transformations, inv_transformations)
The previous code snippet is quite verbose, but you can see that we just set up a large number of parameter defaults, boundaries, transformations and types. We will now use the complete PROSAIL model to simulate the observations over a year. We do this by creating a dictionary where each key is a parameter of interest (in this case lai, cab and cw), and the associated value is a function of time. For LAI, we use the following function:
$$
LAI(t) = 0.21 + 3.51\cdot\sin^{5}(\pi t)
$$
This trajectory dictionary, as well as the definition of the state, is passed to create_parameter_trajectories, which returns an $N_{param}\times N_{dates}$ array with the entire set of parameters for each day of the experiment. Parameters without a prescribed trajectory are set to their default values.
The create_observations function will create observations using the state definition, the parameter grid, and the latitude and longitude values.
In [6]:
from create_emulators import *
# Prescribed temporal trajectories; t is assumed normalised to [0, 1]
trajectories = {
    'lai': lambda t: 0.21 + 3.51 * (np.sin(np.pi * t)**5),
    'cab': lambda t: np.where(t <= 0.5, 5 + 100. * t,
        100 + 5 - 100. * t),
    'cw': lambda t: 0.028 + 0.0150 * (np.sin(np.pi * t + 0.1) *
        np.sin(6 * np.pi * t + 0.1))
}
# Now, create some simulated data...
parameter_grid = create_parameter_trajectories(state, trajectories)
plt.plot(np.arange(1, 366), parameter_grid[6, :], label="LAI")
plt.xlabel("DoY")
plt.ylabel("LAI [$m^{2}m^{-2}$]")
plt.figure()
plt.plot(np.arange(1, 366), parameter_grid[1, :], label="Cab")
plt.xlabel("DoY")
plt.ylabel("Cab")
plt.figure()
plt.plot(np.arange(1, 366), parameter_grid[4, :], label="Cw")
plt.xlabel("DoY")
plt.ylabel("Cw")
# Now forward model the observations (latitude 42N, longitude 8W)
doys, vza, sza, raa, rho = create_observations(state, parameter_grid,
    42, -8.)
The observations already refer to the MODIS bands. Let's see what they look like:
In [17]:
plt.plot ( doys, rho[1, :], 'o', label="NIR")
plt.plot ( doys, rho[0, :], 's', label="RED")
plt.plot ( doys, rho[4, :], '+', label="SWIR5")
plt.ylabel(r'$\rho$')
plt.xlabel("DoY")
Out[17]:
We clearly see the evolution of the NIR reflectance tracking the development of leaf area, whereas the RED reflectance decreases through increased absorption at high LAI values.
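Since the RED and NIR bands move in opposite directions, their normalised difference (NDVI) gives a quick summary of the canopy development (a simple check; the same expression is used further down to build an initial guess):
# NDVI from the simulated RED (band 1) and NIR (band 2) reflectances
ndvi = (rho[1, :] - rho[0, :]) / (rho[1, :] + rho[0, :])
plt.plot(doys, ndvi, 'o-')
plt.xlabel("DoY")
plt.ylabel("NDVI")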
We now need to train the emulators. For the given geometry (vza=15, sza=15, raa=0), we shall use a sampling of parameter space to produce an emulator using the create_emulators function. We will then save that result for posterity. The following snippet looks for a particular filename: if the file exists in the filesystem (current directory), it will load it up and populate the emulator. If the file doesn't exist, it will use the create_emulators method to create the emulators, save them to a file, and then reload them into a dictionary indexed by a geometry tuple:
In [7]:
import os
angles = [[15, 15, 0]]
for i, (s, v, r) in enumerate(angles):
    fname = "%02d_sza_%02d_vza_000_raa" % (s, v)
    if os.path.exists(fname + ".npz"):
        # Load a previously trained emulator from file
        emulators = {}
        emulators[(v, s)] = MultivariateEmulator(dump=fname + ".npz")
    else:
        # Train the emulators, dump them to file, and reload
        emulators, samples, validate = create_emulators(
            state, [""], angles=angles)
        emulators[i].dump_emulator(fname + ".npz")
        emulators[(v, s)] = MultivariateEmulator(dump=fname + ".npz")
The previous code snippet has created the emulators and tested them against a set of random parameter vectors run through the full model. We see that the reported RMSE is very low, so we will ignore the emulator error in this problem. We have also stored the emulator for this geometry in an .npz file. We'll load it up and see how it works:
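As a quick check, we can evaluate the emulator at the default parameter values (a sketch: we assume here that MultivariateEmulator exposes a predict method returning the simulated spectrum and its gradient, which is how eoldas_ng uses it; check the class documentation):
# Build the state vector in transformed units, in state_config order
x = []
for param in state_config.iterkeys():
    val = default_par[param]
    if transformations.has_key(param):
        val = transformations[param](val)
    x.append(val)
x = np.array(x)
# ASSUMPTION: predict() returns the spectrum and its gradient
rho_em, grad = emulators[(15, 15)].predict(x)
plt.plot(wv, rho_em, label="Emulator")
plt.legend(loc='best', fancybox=True)
plt.xlabel("Wavelength [nm]")
plt.ylabel(r'$\rho$')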
eoldas_ng needs a number of operators to be defined. We can start with the most obvious and possibly the easiest to define: the prior. The prior requires a dictionary of mean values and the reciprocal of the variance for each parameter (if a parameter is set to FIXED in state_config, the prior will ignore it, but it doesn't hurt to put values in for each parameter).
The prior mean and inverse variances are expressed as transformed values. In this case, we just use the default values from default_par above, and transform them or not depending on whether a transformation method has been defined.
In [8]:
mu_prior = OrderedDict()
prior_inv_cov = OrderedDict()
# Prior standard deviations (in transformed units); these are
# converted to inverse variances in the loop below
prior_inv_cov['n'] = np.array([1])
prior_inv_cov['cab'] = np.array([0.8])
prior_inv_cov['car'] = np.array([1])
prior_inv_cov['cbrown'] = np.array([1])
prior_inv_cov['cw'] = np.array([1])
prior_inv_cov['cm'] = np.array([1])
prior_inv_cov['lai'] = np.array([0.8])
prior_inv_cov['ala'] = np.array([1])
prior_inv_cov['bsoil'] = np.array([2])
prior_inv_cov['psoil'] = np.array([2])
for param in state.parameter_min.iterkeys():
    if transformations.has_key(param):
        mu_prior[param] = transformations[param](
            np.array([default_par[param]]))
    else:
        mu_prior[param] = np.array([default_par[param]])
    prior_inv_cov[param] = 1. / prior_inv_cov[param]**2
prior = Prior(mu_prior, prior_inv_cov)
Next comes the regularisation. For this we shall use the TemporalSmoother class (also used for the identity operator time series example). We set the smoother to act on LAI, $C_{ab}$ and $C_{w}$, set the value of the regularisation parameter to 10000, and pass the state grid too.
In [9]:
temporal = TemporalSmoother ( state_grid, 10000, required_params=["lai", "cab", "cw"] )
Finally, we define the fit to the observations. To do this, we use the ObservationOperatorTimeSeriesGP class. The class requires the following inputs: the state grid (state_grid), the state object, the array of observations, a mask encoding observation availability and acquisition geometry, the dictionary of emulators, the per-band observational uncertainty, and the band-pass definitions.
NOTE: this format might change in the near future as the code is streamlined further!
In [10]:
# Arrange the observations on the full state grid: rho_big holds the
# reflectances, and mask flags days with an observation together with
# the acquisition geometry (vza, sza, raa)
rho_big = np.zeros((7, 365))
mask = np.zeros((365, 4))
for i in state_grid:
    if i in doys:
        # state_grid is 1-based, so day i maps to column i-1
        rho_big[:, i - 1] = rho[:, doys == i].squeeze()
        mask[i - 1, :] = [1, vza[doys == i], sza[doys == i], raa[doys == i]]
bu = np.ones(7) * 0.01  # per-band uncertainty
obs = ObservationOperatorTimeSeriesGP(state_grid, state, rho_big, mask,
    emulators, bu, band_pass, bw)
When the operators have been defined, we need to add them to the state using the add_operator method, giving each a name for logging and information purposes:
In [11]:
state.add_operator ( "Prior", prior )
state.add_operator ( "Model", temporal )
state.add_operator ( "Obs", obs )
We now need a starting point for the minimisation. We use the prior means for the fixed parameters, an empirical NDVI-based estimate for LAI, and random values within bounds for $C_{ab}$ and $C_{w}$:
In [12]:
x_dict = {}
for i, k in enumerate(state.parameter_max.keys()):
    if state.state_config[k] == VARIABLE:
        x_dict[k] = parameter_grid[i, :]
    else:
        x_dict[k] = parameter_grid[i, 0] * 0 + mu_prior[k]
# Empirical NDVI-based first guess for LAI, interpolated to all days
ndvi = (rho[1, :] - rho[0, :]) / (rho[1, :] + rho[0, :])
lai = 4.2 * np.power(ndvi, 2.138)
laii = np.interp(np.arange(1, 366), doys, lai)
x_dict['lai'] = laii
x_dict['cab'] = np.random.rand(365) * 70.
x_dict['cw'] = np.random.rand(365) * 0.08
retval = state.optimize(x_dict)
In [50]:
[ plt.axvline(x=i, color="0.8") for i in doys ]
plt.plot ( state_grid, retval[0]['lai'] , lw=2.5, label="DA solution")
plt.plot ( state_grid, x_dict['lai'], 'x', label="Initial guess" )
plt.plot ( state_grid, parameter_grid[6, :] , lw=2.5, label="Truth")
plt.legend(loc='best', fancybox=True, shadow=True )
plt.xlabel("DoY")
plt.ylabel("LAI")
plt.figure()
[ plt.axvline(x=i, color="0.8") for i in doys ]
plt.plot ( state_grid, retval[0]['cab'] , lw=2.5, label="DA solution")
plt.plot ( state_grid, x_dict['cab'], 'x', label="Initial guess" )
plt.plot ( state_grid, parameter_grid[1, :] , lw=2.5, label="Truth")
plt.legend(loc='best', fancybox=True, shadow=True )
plt.xlabel("DoY")
plt.ylabel("Cab")
plt.figure()
[ plt.axvline(x=i, color="0.8") for i in doys ]
plt.plot ( state_grid, retval[0]['cw'] , lw=2.5, label="DA solution")
plt.plot ( state_grid, x_dict['cw'], 'x', label="Initial guess" )
plt.plot ( state_grid, parameter_grid[4, :] , lw=2.5, label="Truth")
plt.legend(loc='best', fancybox=True, shadow=True )
plt.xlabel("DoY")
plt.ylabel("Cw")
Out[50]:
We see in the previous plots that the retrieval of LAI is quite good, $C_{ab}$ is a bit more wobbly, and $C_{w}$ is actually good in the centre of the time period, coinciding with high LAI. Clearly, we need to actually calculate the uncertainty to see the real performance of the retrievals. This is typically achieved by calculating the Hessian matrix of the cost function at the minimum. Given that the cost function value is quite low (~350), we can approximate the Hessian by finite differences of the gradient, which is available from state.cost:
In [51]:
x0 = retval[0]
xsol = state.pack_from_dict ( x0, do_transform=True )
xsol = np.array ( xsol )
In [52]:
def hessian_approx(xsol, epsilon=1e-10):
    """Forward-difference approximation to the Hessian, using the
    gradient returned by state.cost."""
    verbose_state = state.verbose
    state.verbose = False
    N = xsol.size
    h = np.zeros((N, N))
    df0 = state.cost(xsol)[1]  # gradient at the solution
    for i in xrange(N):
        # Perturb one element at a time and difference the gradients
        xx0 = 1. * xsol[i]
        xsol[i] = xx0 + epsilon
        df = state.cost(xsol)[1]
        h[i, :] = (df - df0) / epsilon
        xsol[i] = xx0
    # The inverse Hessian approximates the posterior covariance
    post_cov = np.linalg.inv(h)
    post_sigma = np.sqrt(post_cov.diagonal())
    state.verbose = verbose_state
    return h, post_cov, post_sigma
h, post_cov, post_sigma = hessian_approx(xsol)
In [53]:
# xsol stacks the VARIABLE parameters in transformed units:
# cab in [0:365], cw in [365:730], lai in [730:]; the inverse
# transformations map the values back to physical units
plt.fill_between(state_grid, -2 * np.log(xsol[365*2:] - 1.96 * post_sigma[365*2:]),
    -2 * np.log(xsol[365*2:] + 1.96 * post_sigma[365*2:]), facecolor="0.8")
plt.plot(state_grid, parameter_grid[6, :], '-', lw=2.5, label="Truth")
plt.plot(state_grid, -2 * np.log(xsol[365*2:]), '-', lw=2.5)
plt.figure()
plt.fill_between(state_grid, -100 * np.log(xsol[:365] - 1.96 * post_sigma[:365]),
    -100 * np.log(xsol[:365] + 1.96 * post_sigma[:365]), facecolor="0.8")
plt.plot(state_grid, parameter_grid[1, :], lw=2.5, label="Truth")
plt.plot(state_grid, -100 * np.log(xsol[:365]), '-', lw=2.5)
plt.figure()
plt.fill_between(state_grid, -1./50 * np.log(xsol[365:365*2] - 1.96 * post_sigma[365:365*2]),
    -1./50 * np.log(xsol[365:365*2] + 1.96 * post_sigma[365:365*2]), facecolor="0.8")
plt.plot(state_grid, -1/50. * np.log(xsol[365:365*2]), '-', lw=2.5)
plt.plot(state_grid, parameter_grid[4, :], lw=2.5, label="Truth")
Out[53]:
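A simple way to check that these uncertainties are meaningful is to compute how often the truth falls inside the 95% credible interval (a quick sketch; lai_lo, lai_hi and inside are illustrative names):
# Fraction of days where the true LAI lies within the 95% interval
# (the transformation exp(-x/2) is decreasing, hence the bound swap)
lai_lo = -2 * np.log(xsol[365*2:] + 1.96 * post_sigma[365*2:])
lai_hi = -2 * np.log(xsol[365*2:] - 1.96 * post_sigma[365*2:])
inside = np.logical_and(parameter_grid[6, :] >= lai_lo,
    parameter_grid[6, :] <= lai_hi)
print "LAI truth inside 95%% interval: %d%%" % (100 * inside.mean())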
In [19]:
np.savez ( "time_series_result.npz", xsol=xsol, post_sigma=post_sigma, \
hessian=h, post_cov=post_cov )
In [23]:
np.savez ( "observations.npz", rho=rho_big, mask=mask, parameters=parameter_grid )