Grid Search - Colonster - 4 free parameters

Colonster site, MLE with 4 free parameters (grid search method).

For more info about the method used, see the notebook Inference_Notes.

This notebook has the following external dependencies:


In [1]:
import math
import csv

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import seaborn as sns
import yaml

%matplotlib inline

The mathematical (Predictive) Model

The mathematical model is available in the models Python module (see the notebook Models)


In [2]:
import models

The Data


In [3]:
profile_data = pd.read_csv('profiles_data/colonster_10Be_profile_data.csv',
                           index_col='sample',
                           delim_whitespace=True,
                           quoting=csv.QUOTE_NONNUMERIC, quotechar='\"',
                           dtype={'depth': 'f', 'depth_g-cm-2': 'f',
                                  'C': 'f', 'std': 'f'})

profile_data


Out[3]:
depth depth_g-cm-2 C std nuclide
sample
s01 450 886 118424 6928 10Be
s02 400 788 81698 5991 10Be
s03 350 689 133908 4949 10Be
s04 300 591 133243 8756 10Be
s06 200 394 255119 9940 10Be
s07 175 345 333152 11792 10Be
s08 150 295 387154 14811 10Be
s09 125 246 436636 17066 10Be
s10 100 197 710515 24670 10Be

9 rows × 5 columns


In [4]:
with open('profiles_data/colonster_10Be_settings.yaml') as f:
    colonster_settings = yaml.load(f)

colonster_settings


Out[4]:
{'P_0': 5.21, 'altitude': 134.0, 'latitude': 50.58, 'pressure': 997.256}

The dataset is stored as a :class:pandas.DataFrame object.

Fitting the model

The grid search method is implemented in the gridsearch module (see the notebook Grid-Search for more info).


In [5]:
import gridsearch

Create a new object for setup and results


In [6]:
gstest = gridsearch.CosmogenicInferenceGC(description='Colonster 4 parameters')

Set the data


In [7]:
gstest.set_profile_measured(
    profile_data['depth'].values,
    profile_data['C'].values,
    profile_data['std'].values,
    None,
)

Set the model


In [8]:
def C_10Be_colonster(depth, erosion, exposure,
                      density, inheritance):
    """
    10Be colonster
    """
    return models.C_10Be(depth, erosion, exposure,
                         density, inheritance,
                         P_0=colonster_settings['P_0'])

gstest.set_profile_model(C_10Be_colonster)

Define the parameters to fit and their search ranges / steps. The order must be the same than the order of the arguments of the function used for the model!


In [9]:
gstest.set_parameter(
    'erosion_rate',
    [0., 0.5e-3, 70j],
    stats.uniform(loc=0, scale=0.5e-3).pdf
)

gstest.set_parameter(
    'exposure_time',
    [0., 3e6, 130j],
    stats.uniform(loc=0, scale=3e6).pdf
)

gstest.set_parameter(
    'soil_density',
    [1.6, 2.6, 70j],
    stats.uniform(loc=1.6, scale=2.6).pdf
)

gstest.set_parameter(
    'inheritance',
    [0, 1e5, 60j],
    stats.uniform(loc=0, scale=1e5).pdf
)

Grid search setup summary


In [10]:
print gstest.setup_summary()


Modelling C profile (Bayes, Grid-Search)

DESCRIPTION:
Colonster 4 parameters

MEASURED PROFILE (9 samples):
        C  depth nuclide    std
0  118424    450    None   6928
1   81698    400    None   5991
2  133908    350    None   4949
3  133243    300    None   8756
4  255119    200    None   9940
5  333152    175    None  11792
6  387154    150    None  14811
7  436636    125    None  17066
8  710515    100    None  24670

[9 rows x 4 columns]

PROFILE MODEL:
C_10Be_colonster
10Be colonster

'UNKNOWN' PARAMETERS (4):
erosion_rate:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f152300f6d0>>
	range: [0.0, 0.0005, 70j]
exposure_time:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f152300f690>>
	range: [0.0, 3000000.0, 130j]
soil_density:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f152300f850>>
	range: [1.6, 2.6, 70j]
inheritance:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f152300f890>>
	range: [0, 100000.0, 60j]

degrees of freedom: 5

GRID SEARCH:
nb. of nodes per parameter: [70, 130, 70, 60]
total nb. of nodes: 38220000


Perform Maximum likelihood estimation on the search grid


In [11]:
gstest.compute_mle()

Get the MLE (i.e., the parameter values at the maximum likelihood), in the same order than the definition of the parameters


In [12]:
gstest.mle


Out[12]:
[array([ 0.00015217]),
 array([ 1534883.72093023]),
 array([ 2.12173913]),
 array([ 62711.86440678])]

Plot the profile log-likelihood for each parameter. The blue lines represent the difference between the profile log-likelihood and the maximum log-likelihood, The intersections between the blue line and the black lines define the confidence intervals at the given confidence levels (based on the likelihood ratio test). The red lines indicate the true values.


In [13]:
%matplotlib inline

def plot_proflike1d(cobj, pname, clevels=[0.68, 0.95, 0.997],
                    true_val=None, ax=None):
    
    p = cobj.parameters[pname]
    pindex = cobj.parameters.keys().index(pname)
    
    x = cobj.grid[pindex].flatten()
    proflike = cobj.proflike1d[pindex]

    if ax is None:
        ax = plt.subplot(111)
    
    difflike = proflike - cobj.maxlike
    
    ax.plot(x, difflike, label='profile loglike')
    
    ccrit = gridsearch.profile_likelihood_crit(
        cobj.proflike1d[pindex],
        cobj.maxlike,
        clevels=clevels
    )
    ccrit -= cobj.maxlike
    
    for lev, cc in zip(clevels, ccrit):
        l = ax.axhline(cc, color='k')
        hpos = x.min() + (x.max() + x.min()) * 0.05
        ax.text(hpos, cc, str(lev * 100),
                size=9, color = l.get_color(),
                ha="center", va="center",
                bbox=dict(ec='1',fc='1'))
    
    if true_val is not None:
        ax.axvline(true_val, color='r')
    
    plt.setp(ax, xlabel=pname,
             ylabel='profile log-like - max log-like',
             xlim=p['range'][0:2],
             ylim=[ccrit[-1], 0.])
    

def plot_proflike1d_all(cobj, n_subplot_cols=2, **kwargs):
    
    n_subplots = len(cobj.parameters)
    n_subplot_rows = int(math.ceil(1. * 
                                   n_subplots /
                                   n_subplot_cols))
    
    fig, aax = plt.subplots(nrows=n_subplot_rows,
                            ncols=n_subplot_cols,
                            **kwargs)
    axes = aax.flatten()
    fig.text(0.5, 0.975,
             "Profile log-like: " + cobj.description,
             horizontalalignment='center',
             verticalalignment='top')
    
    for i, pname in enumerate(cobj.parameters.keys()):
        ax = axes[i]
        plot_proflike1d(cobj, pname,
                        ax=ax)
    
    plt.tight_layout()
    plt.subplots_adjust(top=0.93)


plot_proflike1d_all(gstest, figsize=(11, 6))


Show the profile log-likelihood for couples of parameters. Confidence regions are also shown (also based on the likelihood ratio test).


In [25]:
def plot_proflike2d(cobj, p1p2, ax=None,
                    cmap='Blues', show_colorbar=True):
    
    pname1, pname2 = p1p2
    idim = cobj.parameters.keys().index(pname2)
    jdim = cobj.parameters.keys().index(pname1)
    
    if ax is None:
        ax = plt.subplot(111)
    
    X, Y = np.meshgrid(cobj.grid[idim].flatten(),
                       cobj.grid[jdim].flatten())
    
    difflike = cobj.proflike2d[idim][jdim] - cobj.maxlike
    
    ccrit = gridsearch.profile_likelihood_crit(
        cobj.proflike2d[idim][jdim],
        cobj.maxlike,
        clevels=[0.68, 0.95]
    )
    ccrit -= cobj.maxlike
    
    contours = np.linspace(np.median(difflike),
                           0,
                           10)
    
    P2D = ax.contourf(Y, X, difflike,
                      contours,
                      cmap=plt.get_cmap(cmap))
    
    ci68 = ax.contour(Y, X, difflike,
                      [ccrit[0]], colors='w',
                      linestyles='solid')
    plt.clabel(ci68, fontsize=8, inline=True,
               fmt='68')
    ci95 = ax.contour(Y, X, difflike,
                      [ccrit[1]], colors=['k'],
                      linestyles='solid')
    plt.clabel(ci95, fontsize=8, inline=True,
               fmt='95')
    
    ax.scatter(cobj.mle[jdim], cobj.mle[idim],
               marker='*', s=60, c='w')
    
    plt.setp(ax, xlabel=pname1, ylabel=pname2)
    
    if show_colorbar:
        plt.colorbar(P2D, ax=ax)
    
    #ax.axhline(true_exposure, color='r')
    #ax.axvline(true_erosion, color='r')
    

fig = plt.figure(figsize=(11, 9))
ax = plt.subplot(321)
plot_proflike2d(gstest, ('erosion_rate', 'exposure_time'), ax=ax)
ax2 = plt.subplot(322)
plot_proflike2d(gstest, ('exposure_time', 'inheritance'), ax=ax2)
ax3 = plt.subplot(323)
plot_proflike2d(gstest, ('erosion_rate', 'inheritance'), ax=ax3)
ax4 = plt.subplot(324)
plot_proflike2d(gstest, ('exposure_time', 'soil_density'), ax=ax4)
ax5 = plt.subplot(325)
plot_proflike2d(gstest, ('erosion_rate', 'soil_density'), ax=ax5)
ax6 = plt.subplot(326)
plot_proflike2d(gstest, ('soil_density', 'inheritance'), ax=ax6)
plt.tight_layout()


Plot the measured concentrations and the predicted profile corresponding to the best fitted data model


In [15]:
sns.set_context('notebook')

depths = np.linspace(profile_data['depth'].min(),
                     profile_data['depth'].max(),
                     100)
Cm_fitted = C_10Be_colonster(depths, *gstest.mle)

plt.figure()
plt.plot(Cm_fitted, -depths, label='best-fitted model')
plt.errorbar(profile_data['C'],
             -profile_data['depth'],
             xerr=profile_data['std'],
             fmt='o', markersize=4,
             label='data')
plt.setp(plt.gca(),
         xlabel='10Be concentration [atoms g-1]',
         ylabel='-1 * depth [cm]',
         xlim=[0, None], ylim=[None, 0])

plt.legend(loc='lower right')


Out[15]:
<matplotlib.legend.Legend at 0x7f1522d4f950>

Observations


In [15]: