In [3]:
%pylab inline
In [22]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
When we're having measurement data, we can verify the model behaviour and confront the (uncertain) measurement data with the (uncertain) model parameters and (uncertain) model structure.
This is directly related to Inverse modelling, Model Calibration, Parameter Estimation and Bayesian methods for model optimization, but this is out of the scope of the lesson today.
We will focus only on one specific method, called Generalised Likelihood Uncertainty Estimation (GLUE). There is a lot of scientific debate in literature about the statistical correctness of the method, but we will apply the method here as a model diagnosis tool, without stating that the final prediction uncertainty of the model is thé correct one. Hence, in uncertainty modelling, there is always a part of epistemic uncertainty...
The basic principle of the GLUE approach is the recognition that, when using potentially overparameterized models, different parameter sets can have similar behaviour in terms of model performance.
So, rather than one single global optimal parameter set, there may be many different parameter sets that produce acceptable simulations, especially if the errors in both model input and observations are taken into account. Indeed, when large errors in the observations are not solely resulting from noise in the measure- ments and parameters are fitted by minimizing the error between the model output and observations, the risk of accepting non-realistic parameter sets arises. Moreover, appropriate parameter combinations can potentially be missed, comparable with a statistical type II error, where a 'hit' is disregarded and seen as a 'miss'. The GLUE approach accounts for this by accepting all model simulations (and corresponding parameter sets) with sufficient performance relative to the measurement errors, instead of looking to the overall best fit.
The major steps of the GLUE method are:
Decision about the evaluation function to be used in the evaluation of the different model simulations in combination with a rejection criterion (threshold) to identify non-behavioural model outputs. The evaluation function (called likelihood) can be both informal like common objective functions such as Sum of Squared Errors (SSE) measures or formal measures using an explicit representation of model and measurement error, although the latter is mostly hard to identify due to the unknown combinations of uncertainties and errors involved. However, when more specific information is available about the measurement and input errors, this could be included in the expression of the likelihood function. As, the likelihood function can be described as a formal likelihood function incorporating the error characteristics (e.g. non-normal, heteroscedastic or autocorrelated model errors). Ideally, the rejection criterion should be chosen before starting the simulations based on the possible observation errors, but in practice the definition of this criterion is mainly a learning process during the nalysis itself.
Selection of model parameters and input variables (or boundary conditions) to consider uncertain inputs and defining the prior distributions of these uncertain inputs. The Monte Carlo runs will be sampled randomly from these distributions and, as such, these must reject the prior knowledge of the parameters. When only little prior information is available, a non-informative uniform distribution is typically selected a priori. The range of the distribution is based on expert knowledge or reported values in literature.
Running a sufficient set of model simulations using a Monte Carlo-based technique, sampling from the prior distributions. Latin Hypercube Sampling (LHS) can be used in order to ensure a representative sampling of the parameter space at a lower number of samples than random sampling. The outcomes of the different runs are compared to the measured values and for each parameter set the corresponding likelihood function value (defined in step 1) is calculated.
The parameter sets with insufficient behaviour (objective function values below the agreed threshold) are considered non-behavioural and excluded from the subsequent analysis by attributing them a zero likelihood. Applying this threshold is a crucial step in the analysis, since it is directly related to the prediction uncertainty. In the ideal situation of exactly one single global optimal parameter set, a threshold would be sought resulting in a likelihood value of the optimal set of one and zero for all the other parameter sets.
The obtained likelihood values of the behavioural model outputs are normalised. To determine model prediction uncertainty, the model outputs are ranked at every time step and the normalised likelihood values are used to construct the cumulative distribution for the output variable. Prediction uncertainty is subsequently determined by selecting the appropriate percentiles (e.g. 5% and 95%) from the Cumulative Distribution Function (CDF) at that time step.
The GLUE method is similar with part 1 (the propagation), but uses data to evaluate if the simulations coming from the Monte Carlo runs are relevant, taking into account uncertainty of data and parameters.
In [2]:
from IPython.display import Image
Image(filename='glue.png')
Out[2]:
The GLUE method is also available in the pySTAN package and other environments. However, today we will give an introduction and script this ourselve to better understand the underlying idea.
In [5]:
from definitions_exercises import RespiroModel #Load in the model
Now, we have data to analyse these model outcomes:
So, we read in these data:
In [6]:
#first a definition to transform our data from minutes to the
#days we're working in in the model
def min2d(TSin):
return TSin/(60*24)
f=np.loadtxt('OUR_exp_3f.txt', skiprows=2) #command to read the data from the textfile
Mtime_min=f[:,0]
Mtime_data=min2d(Mtime_min)
MDO=f[:,1]
MOURex=f[:,2] #DATA of interest!
In [7]:
#visualize the data
fig,ax = plt.subplots(1,2, figsize=(10,6))
ax[0].plot(Mtime_data,MDO,'k'),ax[0].set_title('DO')
ax[1].plot(Mtime_data,MOURex,'k'),ax[1].set_title('OUR')
Out[7]:
So, we start with a Monte Carlo run (or Latin Hypercube), sampling our parameter space, just as in part 1
In [8]:
from definitions_exercises import lhs_uniform #(nsamples,pmin,pmax)
In [9]:
Mtime_d = Mtime_data #We use the same outputs for ourt model as the data is!
#run Monte Carlo for OUR with Latin Hypercube
##################################################
MCruns = 3000
#We save the output in an array with in the rows the different outputs
#and in the columns the timesteps
MCoutputs = np.zeros((MCruns, Mtime_d.size))
#We have to sample the distributions before we start the Monte-Carlo for loop:
X0 = lhs_uniform(MCruns, 650.,700.)
mumax = lhs_uniform(MCruns,3.5,4.5)
Y = lhs_uniform(MCruns,0.77,0.81)
Ks = lhs_uniform(MCruns,0.3,0.5)
tau = lhs_uniform(MCruns,0.00001,0.001)
for i in range(MCruns):
#Run the model for the sampled parameters
Init_unc=np.array([X0[i]])
Pars=[mumax[i],Y[i],Ks[i],tau[i]]
modeloutput = RespiroModel(Pars,Init_unc,Mtime_d)
#Save only the OUR output of the model to analyze
MCoutputs[i,:]=modeloutput[4]
So, we can now compare all the model simulations from the Monte Carlo with this measured output:
In [10]:
for i in range(100): #plot first 100
plt.plot(modeloutput[0],MCoutputs[i,:])
plot(Mtime_data,MOURex,'gray', linewidth=3)
Out[10]:
Comparing the data with the model output can be done with many evaluation (objective) functions. Since we're working in a (Bayesian) statistics environment, these are called Likelihoods. In 'real' Bayesian statistics, this likelihood function is a statistical model describing the error-structure (out of scope for today).
However, to apply the GLUE method, the requirements are less strict and the well-known Nash-Sutcliff (NS) function can also be used:
Basically, best model fits need higher values going to 1 and lower values would go to zero. With th Nash-sutcliffe, values lower than 0 would mean the model is worser than a model that woul represent the mean value of the data, so is considered unbehaviroual to describe the model behaviour.
In [11]:
def Nash_Sutcliff(data, model):
'''
Function for calculating the Nash Sutcliffe objective function
'''
residuals = data-model
DMS = np.sum(residuals**2)
MMS = (np.sum((data-np.mean(model))**2))
Like=max(1.0-(DMS)/(MMS),0.0)
return Like
#Test this for 1 output:
output2use=0
print 'NS is ',Nash_Sutcliff(MOURex, MCoutputs[output2use,:])
plt.plot(Mtime_d,MCoutputs[output2use,:])
plt.plot(Mtime_d,MOURex)
Out[11]:
Calculate the likelihood for every run:
In [12]:
Likels = np.zeros((MCruns,1))
for i in range(MCruns):
Likels[i] = Nash_Sutcliff(MOURex, MCoutputs[i,:])
#Let's find the best one
idmax = np.argmax(Likels)
One way of representing the output of the GLUE analysis, is to show the Likelihood of the analysis for 1 parameter in a scatterplot, in the methodology called 'dotty-plots'.
I'm also adding here the 'best' fit
In [13]:
par2use = mumax
plt.scatter(par2use,Likels, color='k')
plt.scatter(par2use[idmax],Likels[idmax],color='r')
Out[13]:
Dotty plots are very useful for the evaluation of the identification of parameters.
Another way of checking the outcome is a two-dimensional plot, I prepared a script for to make this more visual:
In [14]:
from definitions_exercises import Scatter_hist
When we decided about the cut-off, we have to keep the 'behavioural' parameter sets and create an output graph of the outputs we selected
Let's set a threshold and check the parameter behaviour
In [15]:
#First stack all the parameters in one array:
all_pars = np.vstack((X0,mumax,Y,Ks,tau)).transpose()
all_pars.shape
Out[15]:
We can change the threshold with different values and see how the selected parameters behave:
In [23]:
from matplotlib import cm
threshold = 0.98
#select subset
behavpars = all_pars[np.where(Likels>threshold)[0]]
behavLik = Likels[np.where(Likels>threshold)]
bwx=3
bwy=0.03
x=np.arange(10)
y=np.sin(x)
Scatter_hist(behavpars[:,0], behavpars[:,1], xbinwidth = bwx,
ybinwidth=bwy,cleanstyle = True)
Out[23]:
So, for higher values of the likelihood, we see a relationship between the parameter $\mu_{max}$ and our initial condition $X0$. We can visualize this better with some colors to explain this behaviour:
In [17]:
colormapss=cm.RdYlGn_r
threshold = 0.98
#select subset
behavpars = all_pars[np.where(Likels>threshold)[0]]
behavLik = Likels[np.where(Likels>threshold)]
fig,axScatter,axHistx,axHisty = Scatter_hist(behavpars[:,0], behavpars[:,1],
data1b=all_pars[:,0], data2b=all_pars[:,1],
xbinwidth = bwx, ybinwidth=bwy,
cleanstyle = True, s=25, marker='o',
SSE=1./behavLik, SSEb=1./Likels,
vmax=1./threshold+(1./threshold)*0.1,
colormaps= colormapss, roodlichter=1.0)
axScatter.set_ylabel(r'$\mu_{max}$',fontsize=20)
axScatter.set_xlabel(r'$X_0$',fontsize=20)
Out[17]:
Another way of visualizing this is in 3D (without code, since computational too intensive, done with pure SSE):
In [4]:
Image(filename='goot.png')
Out[4]:
We can make some important issues about this correlation:
So, once we selected the parameters sets that we trust, based on the data, we can actually generate posterior bands of these, just as we did in the first part with error propagation. However, in the GLUE method, the selected outcomes are weighted, so better fits are having more weight.
In [18]:
#behavpars, behavLik we already hav, so now we do this for the outputs
behavMCoutputs = MCoutputs[np.where(Likels>threshold)[0]]
#We normalize the Likelihoods, in order to make the sum 1
#(representing a probability distribution)
Likl_norm=behavLik/behavLik.sum()
Since we want to weight the outputs based on the Likelihood values, we have to adapt the percentile calculation, this is implemented in a separate function:
In [19]:
from definitions_exercises import UncertOut
[lowerb,upperb] = UncertOut(Likl_norm,behavMCoutputs, qval=0.025)
In [20]:
#So, what have we basically done, if we would do it for 1 timestep?
fig,ax = plt.subplots()
#the likelihood as weight:
ax.hist(behavMCoutputs[:,500], weights=Likl_norm, normed=True, color='k')
Out[20]:
So, we have our uncertainty boundaries an can plot them, similar to part 1:
In [25]:
plt.plot(Mtime_d, lowerb,'k--')
plt.plot(Mtime_d, upperb,'k--')
plt.fill_between(Mtime_d, lowerb, upperb, color='gray')
plt.plot(Mtime_d,MOURex,'r-')
Out[25]:
The GLUE method evaluates the uncertain model based on available data to get more information about the model behaviour.
Advantages:
Disadvantages:
So what's more?
Further reading:
Beven, Keith. Environmental Modelling: An Uncertain Future? Taylor & Francis, 2008.
Vrugt, Jasper A., Cajo ter Braak, H. V. Gupta, and Bruce Robinson. “Equifinality of Formal (DREAM) and Informal (GLUE) Bayesian Approaches in Hydrologic Modeling?” Stochastic Environmental Research and Risk Assessment 23, no. 7 (2009): 1011–1026. http://dx.doi.org/10.1007/s00477-008-0274-y.