Datasets of nuclide concentrations vs. depth

Tools for creating generic datasets for testing, plus some real datasets.

Generic datasets

Generate nuclide concentration vs. depth values from a model and add random perturbations.

Realistic perturbations each have a stochastic component and a deterministic component that depends on the concentration value. Each perturbation is thus drawn from a Gaussian with mean $\mu_p = 0$ and standard deviation $\sigma_p$, where $\sigma_p$ is itself drawn from another Gaussian with parameters:

  • $\mu$ = $\sqrt{C}$ * err_magnitude
  • $\sigma$ = $\sqrt{C}$ * err_variability
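The two-stage sampling described above can be sketched on its own as follows. This is a minimal illustration, not the module code: the concentration values are hypothetical, the seed is arbitrary, and it uses NumPy's vectorized `Generator.normal` instead of a per-sample loop. Taking the absolute value of the drawn standard deviations is an added safeguard (an assumption, not stated in the text), since a Gaussian draw can occasionally be negative.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical model concentrations (illustrative values, not real data)
C = np.array([4.0e5, 2.0e5, 1.0e5, 5.0e4])

err_magnitude = 20.0    # scales the mean of the sigma_p distribution
err_variability = 5.0   # scales the spread of the sigma_p distribution

# Stage 1: draw one standard deviation sigma_p per sample from
# a Gaussian with mu = sqrt(C) * err_magnitude and
# sigma = sqrt(C) * err_variability.
sigma_p = rng.normal(loc=np.sqrt(C) * err_magnitude,
                     scale=np.sqrt(C) * err_variability)
sigma_p = np.abs(sigma_p)  # safeguard: a scale must not be negative

# Stage 2: draw each perturbation from a Gaussian with mu_p = 0
# and the sample-specific sigma_p.
perturbation = rng.normal(loc=0.0, scale=sigma_p)

C_perturbed = C + perturbation
print(np.column_stack([C, sigma_p, C_perturbed]))
```

Because both the mean and the spread of $\sigma_p$ scale with $\sqrt{C}$, samples with higher concentrations receive larger absolute errors but smaller relative errors.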

The generated dataset is returned as a :class:`pandas.DataFrame` object. The following code is saved as the Python module `gendata` so that it can be reused in other notebooks.


In [1]:
%%writefile gendata.py

"""
Create generic datasets of
nuclide concentrations vs. depth.

"""

import numpy as np
import pandas as pd

import models


def generate_dataset(model, model_args, model_kwargs=None,
                     zlimits=[50, 500], n=10,
                     err=[20., 5.]):
    """
    Create a generic dataset of nuclide concentrations
    vs. depth (for testing).
    
    Parameters
    ----------
    model : callable
        the model to use for generating the data
    model_args : list, tuple
        arguments to pass to `model`
    model_kwargs : dict
        keyword arguments to pass to `model`
    zlimits : [float, float]
        depths min and max values
    n : int
        sample size
    err : float or [float, float]
        fixed error (one value given) or
        error magnitude and error variability
        (two values given, see below)
    
    Returns
    -------
    :class:`pandas.DataFrame` object
    
    Notes
    -----
    The returned dataset corresponds to
    concentration values predicted by
    the model + random perturbations.
    
    When one value is given for `err`, the
    perturbations are all generated using a
    Gaussian of mu=0 and sigma=fixed error.
    
    When two values are given for `err`, each
    perturbation is generated using a Gaussian
    of mu=0 and sigma given by another Gaussian: 
    
    mu =  sqrt(concentration) * error magnitude
    sigma = sqrt(concentration) * error variability
    
    """

    zmin, zmax = zlimits
    model_kwargs = model_kwargs or dict()
    
    depths = np.linspace(zmin, zmax, n)
    
    profile_data = pd.DataFrame()

    profile_data['depth'] = depths
    profile_data['C'] = model(profile_data['depth'],
                              *model_args,
                              **model_kwargs)

    try:
        err_magn, err_var = err
        
        err_mu = err_magn * np.sqrt(profile_data['C'])
        err_sigma = err_var * np.sqrt(profile_data['C'])

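        # draw one standard deviation per sample; take the absolute
        # value because a Gaussian draw can (rarely) be negative,
        # which would make the second draw below fail
        profile_data['std'] = np.abs(np.array(
            [np.random.normal(loc=mu, scale=sigma)
             for mu, sigma in zip(err_mu, err_sigma)]
            ))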
    except TypeError:
        profile_data['std'] = np.ones_like(depths) * err

    error = np.array([np.random.normal(scale=std)
                      for std in profile_data['std']])
    profile_data['C'] += error
    
    return profile_data


Overwriting gendata.py

Real datasets

Create a folder to save the dataset files.


In [2]:
%mkdir profiles_data


mkdir: cannot create directory `profiles_data': File exists

Lodomez 10Be

Dataset collected in the upper Amblève valley (NE Belgium). See Rixhon et al. (2011).


In [3]:
%%writefile profiles_data/lodomez_10Be_profile_data.csv
"sample" "depth" "depth_g-cm-2" "C" "std" "nuclide"
"s01" 250 451 43005  1695  "10Be"
"s02" 200 361 94800  2024  "10Be"
"s03" 165 298 148569 3621  "10Be"
"s04" 100 181 269566 5038  "10Be"
"s05" 50  90  432800 11714 "10Be"


Overwriting profiles_data/lodomez_10Be_profile_data.csv
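The profile files are whitespace-delimited with double-quoted strings, so they need a regular-expression separator when read with pandas. The sketch below shows one way to load such a file; it reads an inline copy of the Lodomez data through `io.StringIO` so that it runs on its own, but a file path can be passed to `pandas.read_csv` in the same way. The density calculation assumes that the `depth` column is in cm, which the file itself does not state.

```python
import io

import pandas as pd

# Inline copy of profiles_data/lodomez_10Be_profile_data.csv
csv_text = '''"sample" "depth" "depth_g-cm-2" "C" "std" "nuclide"
"s01" 250 451 43005  1695  "10Be"
"s02" 200 361 94800  2024  "10Be"
"s03" 165 298 148569 3621  "10Be"
"s04" 100 181 269566 5038  "10Be"
"s05" 50  90  432800 11714 "10Be"
'''

# sep=r'\s+' treats any run of whitespace as one delimiter;
# the default quote character (") strips the quotes.
profile = pd.read_csv(io.StringIO(csv_text), sep=r'\s+')
print(profile)

# Assumption: 'depth' is in cm, so mass depth / depth gives an
# implied bulk density in g/cm^3.
implied_density = profile['depth_g-cm-2'] / profile['depth']
print(implied_density)
```

Under that unit assumption, the ratio of the two depth columns is close to 1.8 for every sample in this file.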

In [4]:
%%writefile profiles_data/lodomez_10Be_settings.yaml
# 10Be surface production rate
P_0: 6.13

# sample site latitude [degrees]
latitude: 50.39

# sample site altitude [meters]
altitude: 283.0

# pressure [hPa]
pressure: 979.711


Overwriting profiles_data/lodomez_10Be_settings.yaml
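The `pressure` values in the settings files appear to be consistent with the standard-atmosphere barometric formula evaluated at the site altitude. Whether they were actually derived this way is an assumption; the sketch below only checks the agreement. The function name `standard_pressure` and the constants (sea-level pressure 1013.25 hPa, sea-level temperature 288.15 K, lapse rate 0.0065 K/m) are not taken from this notebook.

```python
def standard_pressure(altitude_m):
    """Standard-atmosphere pressure [hPa] at a given altitude [m].

    Hypothetical helper: the constants are the usual standard-atmosphere
    values, not values defined anywhere in this notebook.
    """
    p0 = 1013.25       # sea-level pressure [hPa]
    t0 = 288.15        # sea-level temperature [K]
    lapse = 0.0065     # temperature lapse rate [K/m]
    g = 9.80665        # gravitational acceleration [m/s^2]
    molar_mass = 0.0289644   # molar mass of dry air [kg/mol]
    gas_const = 8.31447      # universal gas constant [J/(mol K)]

    exponent = g * molar_mass / (gas_const * lapse)
    return p0 * (1.0 - lapse * altitude_m / t0) ** exponent


# Altitude and pressure pairs as written in the settings files
sites = {
    'lodomez': (283.0, 979.711),
    'belleroche': (153.0, 995.004),
    'colonster': (134.0, 997.256),
    'romont': (109.0, 1000.224),
}

for name, (altitude, pressure) in sites.items():
    print(name, round(standard_pressure(altitude), 3), pressure)
```

For all four sites the computed pressure agrees closely with the value stored in the corresponding settings file.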

Belleroche 10Be

Dataset collected in the lower Amblève valley (NE Belgium). See Rixhon et al. (2011).


In [5]:
%%writefile profiles_data/belleroche_10Be_profile_data.csv
"sample" "depth" "depth_g-cm-2" "C" "std" "nuclide"
"s01" 300 643 46216  1728 "10Be"
"s02" 200 429 64965  2275 "10Be"
"s03" 150 322 128570 2766 "10Be"
"s04" 100 214 191825 3303 "10Be"
"s05" 60  129 323967 5454 "10Be"


Overwriting profiles_data/belleroche_10Be_profile_data.csv

In [6]:
%%writefile profiles_data/belleroche_10Be_settings.yaml
# 10Be surface production rate
P_0: 5.3

# sample site latitude [degrees]
latitude: 50.48

# sample site altitude [meters]
altitude: 153.0

# pressure [hPa]
pressure: 995.004


Overwriting profiles_data/belleroche_10Be_settings.yaml

Colonster 10Be

Dataset collected in the Ourthe valley (NE Belgium). See Rixhon et al. (2011).


In [7]:
%%writefile profiles_data/colonster_10Be_profile_data.csv
"sample" "depth" "depth_g-cm-2" "C" "std" "nuclide"
"s01" 450 886 118424 6928  "10Be"
"s02" 400 788 81698  5991  "10Be"
"s03" 350 689 133908 4949  "10Be"
"s04" 300 591 133243 8756  "10Be"
"s06" 200 394 255119 9940  "10Be"
"s07" 175 345 333152 11792 "10Be"
"s08" 150 295 387154 14811 "10Be"
"s09" 125 246 436636 17066 "10Be"
"s10" 100 197 710515 24670 "10Be"


Overwriting profiles_data/colonster_10Be_profile_data.csv

In [8]:
%%writefile profiles_data/colonster_10Be_settings.yaml
# 10Be surface production rate
P_0: 5.21

# sample site latitude [degrees]
latitude: 50.58

# sample site altitude [meters]
altitude: 134.0

# pressure [hPa]
pressure: 997.256


Overwriting profiles_data/colonster_10Be_settings.yaml

Romont 10Be & 26Al

Dataset collected in the Meuse valley (NE Belgium). See Rixhon et al. (2011).


In [9]:
%%writefile profiles_data/romont_10Be_26Al_profile_data.csv
"sample" "depth" "depth_g-cm-2" "C" "std" "nuclide"
"s01" 750 1500 193732  5114   "10Be"
"s02" 650 1300 261858  6039   "10Be"
"s03" 550 1100 136098  13147  "10Be"
"s04" 450 900  186859  5153   "10Be"
"s05" 350 700  333915  7973   "10Be"
"s06" 310 620  654394  10387  "10Be"
"s07" 750 1500 702042  33437  "26Al"
"s08" 650 1300 992018  49251  "26Al"
"s09" 550 1100 467655  39998  "26Al"
"s10" 450 900  923354  45139  "26Al"
"s11" 350 700  1489555 126714 "26Al"
"s12" 310 620  2573447 301870 "26Al"


Writing profiles_data/romont_10Be_26Al_profile_data.csv
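Unlike the other files, the Romont file stores two nuclides in a single table, distinguished by the `nuclide` column. A minimal sketch of how the two profiles could be separated with pandas is given below; the variable names are illustrative, and the data are read from an inline copy of the file so that the example is self-contained.

```python
import io

import pandas as pd

# Inline copy of profiles_data/romont_10Be_26Al_profile_data.csv
csv_text = '''"sample" "depth" "depth_g-cm-2" "C" "std" "nuclide"
"s01" 750 1500 193732  5114   "10Be"
"s02" 650 1300 261858  6039   "10Be"
"s03" 550 1100 136098  13147  "10Be"
"s04" 450 900  186859  5153   "10Be"
"s05" 350 700  333915  7973   "10Be"
"s06" 310 620  654394  10387  "10Be"
"s07" 750 1500 702042  33437  "26Al"
"s08" 650 1300 992018  49251  "26Al"
"s09" 550 1100 467655  39998  "26Al"
"s10" 450 900  923354  45139  "26Al"
"s11" 350 700  1489555 126714 "26Al"
"s12" 310 620  2573447 301870 "26Al"
'''

profile = pd.read_csv(io.StringIO(csv_text), sep=r'\s+')

# Split the table into one DataFrame per nuclide;
# groupby keeps the original row order within each group.
by_nuclide = {name: group.reset_index(drop=True)
              for name, group in profile.groupby('nuclide')}

be10 = by_nuclide['10Be']
al26 = by_nuclide['26Al']

print(be10[['depth', 'C', 'std']])
print(al26[['depth', 'C', 'std']])
```

Both nuclides were measured at the same six depths, so the two sub-tables line up row by row after the split.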

In [10]:
%%writefile profiles_data/romont_10Be_26Al_settings.yaml
# 10Be surface production rate
P_0: 5.09

# sample site latitude [degrees]
latitude: 50.78

# sample site altitude [meters]
altitude: 109.0

# pressure [hPa]
pressure: 1000.224


Writing profiles_data/romont_10Be_26Al_settings.yaml