Grid Search - Lodomez - 4 free parameters

Lodomez site, MLE with 4 free parameters (grid search method).

For more info about the method used, see the notebook Inference_Notes.

This notebook has the following external dependencies:


In [1]:
import math
import csv

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import seaborn as sns
import yaml

%matplotlib inline

The mathematical (Predictive) Model

The mathematical model is available in the models Python module (see the notebook Models)


In [2]:
import models

The Data


In [3]:
profile_data = pd.read_csv('profiles_data/lodomez_10Be_profile_data.csv',
                           index_col='sample',
                           delim_whitespace=True,
                           quoting=csv.QUOTE_NONNUMERIC, quotechar='\"',
                           dtype={'depth': 'f', 'depth_g-cm-2': 'f',
                                  'C': 'f', 'std': 'f'})

profile_data


Out[3]:
depth depth_g-cm-2 C std nuclide
sample
s01 250 451 43005 1695 10Be
s02 200 361 94800 2024 10Be
s03 165 298 148569 3621 10Be
s04 100 181 269566 5038 10Be
s05 50 90 432800 11714 10Be

5 rows × 5 columns


In [4]:
with open('profiles_data/lodomez_10Be_settings.yaml') as f:
    lodomez_settings = yaml.load(f)

lodomez_settings


Out[4]:
{'P_0': 6.13, 'altitude': 283.0, 'latitude': 50.39, 'pressure': 979.711}

The dataset is stored as a :class:pandas.DataFrame object.

Fitting the model

The grid search method is implemented in the gridsearch module (see the notebook Grid-Search for more info).


In [5]:
import gridsearch

Create a new object for setup and results


In [6]:
gstest = gridsearch.CosmogenicInferenceGC(description='Lodomez 4 parameters')

Set the data


In [7]:
gstest.set_profile_measured(
    profile_data['depth'].values,
    profile_data['C'].values,
    profile_data['std'].values,
    None,
)

Set the model


In [8]:
def C_10Be_lodomez(depth, erosion, exposure,
                      density, inheritance):
    """
    10Be lodomez
    """
    return models.C_10Be(depth, erosion, exposure,
                         density, inheritance,
                         P_0=lodomez_settings['P_0'])

gstest.set_profile_model(C_10Be_lodomez)

Define the parameters to fit and their search ranges / steps. The order must be the same than the order of the arguments of the function used for the model!


In [9]:
gstest.set_parameter(
    'erosion_rate',
    [0., 0.4e-3, 70j],
    stats.uniform(loc=0, scale=0.4e-3).pdf
)

gstest.set_parameter(
    'exposure_time',
    [1e5, 2e5, 100j],
    stats.uniform(loc=1e5, scale=2e5).pdf
)

gstest.set_parameter(
    'soil_density',
    [1.8, 2.1, 70j],
    stats.uniform(loc=1.8, scale=2.1).pdf
)

gstest.set_parameter(
    'inheritance',
    [0, 5e3, 60j],
    stats.uniform(loc=0, scale=5e3).pdf
)

Grid search setup summary


In [10]:
print gstest.setup_summary()


Modelling C profile (Bayes, Grid-Search)

DESCRIPTION:
Lodomez 4 parameters

MEASURED PROFILE (5 samples):
        C  depth nuclide    std
0   43005    250    None   1695
1   94800    200    None   2024
2  148569    165    None   3621
3  269566    100    None   5038
4  432800     50    None  11714

[5 rows x 4 columns]

PROFILE MODEL:
C_10Be_lodomez
10Be lodomez

'UNKNOWN' PARAMETERS (4):
erosion_rate:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f4b6714c4d0>>
	range: [0.0, 0.0004, 70j]
exposure_time:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f4b6714c490>>
	range: [100000.0, 200000.0, 100j]
soil_density:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f4b6714c650>>
	range: [1.8, 2.1, 70j]
inheritance:
	prior: <bound method rv_frozen.pdf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f4b6714c690>>
	range: [0, 5000.0, 60j]

degrees of freedom: 1

GRID SEARCH:
nb. of nodes per parameter: [70, 100, 70, 60]
total nb. of nodes: 29400000


Perform Maximum likelihood estimation on the search grid


In [11]:
gstest.compute_mle()

Get the MLE (i.e., the parameter values at the maximum likelihood), in the same order than the definition of the parameters


In [12]:
gstest.mle


Out[12]:
[array([ 0.]), array([ 140404.04040404]), array([ 1.90434783]), array([ 0.])]

Plot the profile log-likelihood for each parameter. The blue lines represent the difference between the profile log-likelihood and the maximum log-likelihood, The intersections between the blue line and the black lines define the confidence intervals at the given confidence levels (based on the likelihood ratio test). The red lines indicate the true values.


In [13]:
%matplotlib inline

def plot_proflike1d(cobj, pname, clevels=[0.68, 0.95, 0.997],
                    true_val=None, ax=None):
    
    p = cobj.parameters[pname]
    pindex = cobj.parameters.keys().index(pname)
    
    x = cobj.grid[pindex].flatten()
    proflike = cobj.proflike1d[pindex]

    if ax is None:
        ax = plt.subplot(111)
    
    difflike = proflike - cobj.maxlike
    
    ax.plot(x, difflike, label='profile loglike')
    
    ccrit = gridsearch.profile_likelihood_crit(
        cobj.proflike1d[pindex],
        cobj.maxlike,
        clevels=clevels
    )
    ccrit -= cobj.maxlike
    
    for lev, cc in zip(clevels, ccrit):
        l = ax.axhline(cc, color='k')
        hpos = x.min() + (x.max() + x.min()) * 0.05
        ax.text(hpos, cc, str(lev * 100),
                size=9, color = l.get_color(),
                ha="center", va="center",
                bbox=dict(ec='1',fc='1'))
    
    if true_val is not None:
        ax.axvline(true_val, color='r')
    
    plt.setp(ax, xlabel=pname,
             ylabel='profile log-like - max log-like',
             xlim=p['range'][0:2],
             ylim=[ccrit[-1], 0.])
    

def plot_proflike1d_all(cobj, n_subplot_cols=2, **kwargs):
    
    n_subplots = len(cobj.parameters)
    n_subplot_rows = int(math.ceil(1. * 
                                   n_subplots /
                                   n_subplot_cols))
    
    fig, aax = plt.subplots(nrows=n_subplot_rows,
                            ncols=n_subplot_cols,
                            **kwargs)
    axes = aax.flatten()
    fig.text(0.5, 0.975,
             "Profile log-like: " + cobj.description,
             horizontalalignment='center',
             verticalalignment='top')
    
    for i, pname in enumerate(cobj.parameters.keys()):
        ax = axes[i]
        plot_proflike1d(cobj, pname,
                        ax=ax)
    
    plt.tight_layout()
    plt.subplots_adjust(top=0.93)


plot_proflike1d_all(gstest, figsize=(11, 6))


Show the profile log-likelihood for couples of parameters. Confidence regions are also shown (also based on the likelihood ratio test).


In [14]:
def plot_proflike2d(cobj, p1p2, ax=None,
                    cmap='Blues', show_colorbar=True):
    
    pname1, pname2 = p1p2
    idim = cobj.parameters.keys().index(pname2)
    jdim = cobj.parameters.keys().index(pname1)
    
    if ax is None:
        ax = plt.subplot(111)
    
    X, Y = np.meshgrid(cobj.grid[idim].flatten(),
                       cobj.grid[jdim].flatten())
    
    difflike = cobj.proflike2d[idim][jdim] - cobj.maxlike
    
    ccrit = gridsearch.profile_likelihood_crit(
        cobj.proflike2d[idim][jdim],
        cobj.maxlike,
        clevels=[0.68, 0.95]
    )
    ccrit -= cobj.maxlike
    
    contours = np.linspace(np.median(difflike),
                           0,
                           10)
    
    P2D = ax.contourf(Y, X, difflike,
                      contours,
                      cmap=plt.get_cmap(cmap))
    
    ci68 = ax.contour(Y, X, difflike,
                      [ccrit[0]], colors='w',
                      linestyles='solid')
    plt.clabel(ci68, fontsize=8, inline=True,
               fmt='68')
    ci95 = ax.contour(Y, X, difflike,
                      [ccrit[1]], colors=['k'],
                      linestyles='solid')
    plt.clabel(ci95, fontsize=8, inline=True,
               fmt='95')
    
    ax.scatter(cobj.mle[jdim], cobj.mle[idim],
               marker='*', s=60, c='w')
    
    plt.setp(ax, xlabel=pname1, ylabel=pname2)
    
    if show_colorbar:
        plt.colorbar(P2D, ax=ax)
    
    #ax.axhline(true_exposure, color='r')
    #ax.axvline(true_erosion, color='r')
    

fig = plt.figure(figsize=(11, 9))
ax = plt.subplot(321)
plot_proflike2d(gstest, ('erosion_rate', 'exposure_time'), ax=ax)
ax2 = plt.subplot(322)
plot_proflike2d(gstest, ('exposure_time', 'inheritance'), ax=ax2)
ax3 = plt.subplot(323)
plot_proflike2d(gstest, ('erosion_rate', 'inheritance'), ax=ax3)
ax4 = plt.subplot(324)
plot_proflike2d(gstest, ('exposure_time', 'soil_density'), ax=ax4)
ax5 = plt.subplot(325)
plot_proflike2d(gstest, ('erosion_rate', 'soil_density'), ax=ax5)
ax6 = plt.subplot(326)
plot_proflike2d(gstest, ('soil_density', 'inheritance'), ax=ax6)
plt.tight_layout()


Plot the measured concentrations and the predicted profile corresponding to the best fitted data model


In [15]:
sns.set_context('notebook')

depths = np.linspace(profile_data['depth'].min(),
                     profile_data['depth'].max(),
                     100)
Cm_fitted = C_10Be_lodomez(depths, *gstest.mle)

plt.figure()
plt.plot(Cm_fitted, -depths, label='best-fitted model')
plt.errorbar(profile_data['C'],
             -profile_data['depth'],
             xerr=profile_data['std'],
             fmt='o', markersize=4,
             label='data')
plt.setp(plt.gca(),
         xlabel='10Be concentration [atoms g-1]',
         ylabel='-1 * depth [cm]',
         xlim=[0, None], ylim=[None, 0])

plt.legend(loc='lower right')


Out[15]:
<matplotlib.legend.Legend at 0x7f4b66d37b10>

Observations


In [15]: