Getting Started

This notebook gives a whirlwind overview of the ionchannelABC library and can be used for testing purposes of a first installation. The notebook follows the workflow for parameter inference of a generic T-type Ca2+ channel model.

It is recommended to have some understanding of ion channel models, voltage clamp protocols and fundamentals of the Approximate Bayesian Computation algorithm before working through this notebook. Wikipedia and the pyabc documentation will likely be sufficient.


In [ ]:
# Importing standard libraries
import numpy as np
import pandas as pd

Setting up an ion channel model and experiments

First we need to load in a cell model. We use IonChannelModel, which is a wrapper around the myokit simulation functionality which handles compilation of the model for use with the pyabc library. The model loads a MMT file which is a description of the mathematics behind the opening/closing of activation/inactivation gates in myokit format (see https://myokit.readthedocs.io/syntax/model.html). We also need to specify the independent variable name in the MMT file (generally transmembrane voltage) and a list of variables we want to log from simulations.


In [ ]:
from ionchannelABC import IonChannelModel

icat = IonChannelModel('icat',
                       'models/Generic_iCaT.mmt',
                       vvar='membrane.V',
                       logvars=['environment.time',
                                'icat.G_CaT',
                                'icat.i_CaT'])

Now that we have loaded a cell model, we need to specify how we will test it to compare with experimental data. We use the ExperimentData and ExperimentStimProtocol classes to specify the experimental dataset and experimental protocol respectively. These are then combined in the Experiment class. The data is specified in a separate .py file with functions to return the x, y and, if available, error bars extracted from graphs.

We show an example using T-type Ca2+ channel peak current density at a range of activating voltage steps in HL-1 myocytes from Nguyen et al, STIM1 participates in the contractile rhythmicity of HL-1 cells by moderating T-type Ca(2+) channel activity, 2013.


In [ ]:
import data.icat.data_icat as data
from ionchannelABC import (Experiment,
                           ExperimentData,
                           ExperimentStimProtocol)

vsteps, peak_curr, errs, N = data.IV_Nguyen()
nguyen_data = ExperimentData(x=vsteps, y=peak_curr,
                             N=N, errs=errs,
                             err_type='SEM') # this flag is currently not used but may change in future version

The stimulation protocol is defined from the experimental methods of the data source. It should be replicated as close as possible to reproduce experimental conditions. This example shows a standard 'I-V curve' testing peak current density at different voltage steps from a resting potential. The transmembrane potential is held at a resting potential of -75mV for sufficient time for the channel to reach its steady-state (we assume 5000ms here), it is stepped to each test potential for 300ms and then returned to the resting potential.


In [ ]:
stim_times = [5000, 300, 500] # describes the course of one voltage step in time
stim_levels = [-75, vsteps, -75] # each entry of levels corresponds to the time above

Having defined what we are doing with the model, we need to define what we do with the simulation data and which part of the protocol (i.e. index of stim_times and stim_levels) we are interested in extracting the data from. The simulation will return a list of pandas.Dataframe containing each of logvars defined in the ion channel model declaration. Here, we want to reduce this data to just the peak current density at the step potential (i.e. index 1 in stim_times and stim_levels). Our list will only have length 1 because we are only interested in data from this point in the protocol, but more complex protocols may return longer lists.


In [ ]:
def max_icat(data):
    return max(data[0]['icat.i_CaT'], key=abs)

nguyen_protocol = ExperimentStimProtocol(stim_times,
                                         stim_levels,
                                         measure_index=1, # index from `stim_times` and `stim_levels` 
                                         measure_fn=max_icat)

The final key part of defining the experiment is the experimental conditions, which includes extra/intracellular ion concentrations and temperature reported in the data source. Here, the dictionary keys refer to variables in the [membrane] field of the MMT ion channel definition file.

We can then combine the previous steps in a single Experiment.


In [ ]:
nguyen_conditions = dict(Ca_o=5000,     # extracellular Ca2+ concentration of 5000uM
                         Ca_subSL=0.2,  # sub-sarcolemmal (i.e. intracellular) Ca2+ concentration of 0.2uM
                         T=295)         # experiment temperature of 295K

nguyen_experiment = Experiment(nguyen_protocol, nguyen_data, nguyen_conditions)

We then add the experiment to the IonChannelModel defined previously. We can test it runs using the sample method with default parameters to debug any problems at this stage.


In [ ]:
icat.add_experiments([nguyen_experiment])
test = icat.sample({}) # empty dictionary as we are not overwriting any of the parameters in the model definition yet

The plot_sim_results function makes it easy to plot the output of simulations.


In [ ]:
import matplotlib.pyplot as plt
import seaborn as sns
from ionchannelABC import plot_sim_results
%matplotlib inline
plot_sim_results(test, obs=icat.get_experiment_data())

Clearly the default parameters in the MMT file are not quite right, but we are able to run the simulation and compare to the results.

In practice, the ion channel setup and model experiments can be defined in a separate .py file and loaded in a single step, which we will do below for the next step. Examples are contained in the channel examples folder. By plotting, we can see that 6 separate experiments have been defined.


In [ ]:
from channels.icat_generic import icat as model
test = model.sample({})
plot_sim_results(test, obs=model.get_experiment_data())

Setting up parameter inference for the defined model

Next we need to specify which parameters in our ion channel model should be varied during the parameter inference step. We do this by defining a prior distribution for each parameter in the MMT file we want to vary. The width of the prior distribution should be sufficient to reduce bias while incorporating specific knowledge about the model structure (i.e. if a parameter should be defined positive or in a reasonable range). A good rule-of-thumb is to use an order of magnitude around a parameter value in a previously published model of the channel, but the width can be increased in future runs of the ABC algorithm.


In [ ]:
from pyabc import (RV, Distribution)   # we use two classes from the pyabc library for this definition
limits = dict(g_CaT=(0, 2),            # these parameter keys are specific to the icat model being investigated
              v_offset=(0, 500),
              Vhalf_b=(-100, 100),
              k_b=(0, 10),
              c_bb=(0, 10),
              c_ab=(0, 100),
              sigma_b=(0, 100),
              Vmax_b=(-100, 100),
              Vhalf_g=(-100, 100),
              k_g=(-10, 0),
              c_bg=(0, 50),
              c_ag=(0, 500),
              sigma_g=(0, 100),
              Vmax_g=(-100, 100))
prior = Distribution(**{key: RV("uniform", a, b - a)
                        for key, (a,b) in limits.items()})

We can now define additional requirements for the ABC-SMC algorithm. We need a distance function to measure how well our model can approximate experimental data.

The IonChannelDistance class implements a weighted Euclidean distance function. The weight assigned to each data point accounts for the separate experiments (i.e. we do not want to over-fit to behaviour of an experiment just because it has a greater number of data points), the scale of the dependent variable in each experiment, and the size of errors bars in the experimental data (i.e. if we prefer the model to reproduce more closely data points with a lower level of uncertainty).

We can see how this corresponds to the data we are using in this example by plotting the data points using plot_distance_weights.


In [ ]:
from ionchannelABC import (IonChannelDistance, plot_distance_weights)

measurements = model.get_experiment_data()
obs = measurements.to_dict()['y']
exp = measurements.to_dict()['exp']
errs = measurements.to_dict()['errs']

distance_fn = IonChannelDistance(obs=obs, exp_map=exp, err_bars=errs, err_th=0.1)
plot_distance_weights(model, distance_fn)

We also need to assign a database file for the pyabc implementation of the ABC-SMC algorithm to store information about the ABC particles at intermediate steps as it runs. A temporary location with sufficient storage is a good choice as these files can become quite large for long ABC runs. This can be defined by setting the $TMPDIR environment variable as described in the installation instructions.

The "sqlite:///" at the start of the path is necessary for database access.


In [ ]:
import tempfile, os
db_path = ("sqlite:///" + 
           os.path.join(tempfile.gettempdir(), "example.db"))
print(db_path)

Running the ABC algorithm

We are now ready to run parameter inference on our ion channel model.

Before starting the algorithm, it is good practice to enable logging options to help any debugging which may be necessary. The default options below should be sufficient.


In [ ]:
import logging
logging.basicConfig()
abc_logger = logging.getLogger('ABC')
abc_logger.setLevel(logging.DEBUG)
eps_logger = logging.getLogger('Epsilon')
eps_logger.setLevel(logging.DEBUG)
cv_logger = logging.getLogger('CV Estimation')
cv_logger.setLevel(logging.DEBUG)

ABCSMC from the pyabc library is the main class used for the algorithm. It initialises with a number of options which are well described in the pyabc documentation. Note we initialize some of the passed objects at this stage and do not pass in pre-initialised variables, particulary for the distance function.

A brief description is given below to key options:

  • population_size: Number of particles to use in the ABC algorithm. pyabc ConstantPopulationSize and AdaptivePopulationSize have been tested. Unless adaptive population size is explicitly required, it is recommended to use a constant particle population with sufficient population for the size of the model being tested to avoid parameter distributions collapsing on single point estimates. For this example, we will use 2000, however up to 5000 particles has been tested on more complex models. Larger particle populations will increase algorithm run times.
  • summary_statistics: Function to convert raw output from the model into an appropriate format for calculating distance. Use the custom implementation of ion_channel_sum_stats_calculator.
  • transitions: pyabc Transition object for pertubation of particles at each algorithm step. Use custom implementation of EfficientMultivariateNormalTransition.
  • eps: pyabc Epsilon object defining how acceptance threshold is adapted over iterations. Generally use MedianEpsilon for the median distance of the previous iterations accepted particles.
  • sampler: Can be used to specify the number of parallel processes to initiate. Only pyabc MulticoreEvalParallelSampler has been tested. If on local machine, initiate with default parameters. If using computing cluster, the parameter n_procs can specify how many processes to initiate (12 is a good starting point). Warning: increasing the number of processes will not necessarily speed up the algorithm.
  • acceptor: pyabc Acceptor object decides which particles to allow to pass to the next iteration. Use custom implementation IonChannelAcceptor.

In [ ]:
from pyabc import ABCSMC
from pyabc.epsilon import MedianEpsilon
from pyabc.populationstrategy import ConstantPopulationSize
from pyabc.sampler import MulticoreEvalParallelSampler

In [ ]:
from ionchannelABC import (ion_channel_sum_stats_calculator,
                           IonChannelAcceptor,
                           IonChannelDistance,
                           EfficientMultivariateNormalTransition)

In [ ]:
abc = ABCSMC(models=model,
             parameter_priors=prior,
             distance_function=IonChannelDistance(
                 obs=obs,
                 exp_map=exp,
                 err_bars=errs,
                 err_th=0.1),
             population_size=ConstantPopulationSize(1000),
             summary_statistics=ion_channel_sum_stats_calculator,
             transitions=EfficientMultivariateNormalTransition(),
             eps=MedianEpsilon(),
             sampler=MulticoreEvalParallelSampler(n_procs=12),
             acceptor=IonChannelAcceptor())

The algorithm is initialised and run as specified in pyabc documentation. These lines are not set to run as the algorithm can take several hours to days to finish for large models. Following steps will use a previous run example.

abc_id = abc.new(db_path, obs)
history = abc.run(minimum_epsilon=0.1,
                  max_nr_populations=20,
                  min_acceptance_rate=0.01)

Analysing the results

Once the ABC run is complete, we have a number of custom plotting function to analyse the History object output from the running the ABC algorithm.

A compressed example database file is can be found here. On Linux, this can be extracted to the original .db format using tar -xcvf hl-1_icat-generic.tgz.

Firstly, we can load a previously run example file.


In [ ]:
from pyabc import History
history = History('sqlite:///results/icat-generic/hl-1_icat-generic.db')
history.all_runs()

In [ ]:
df, w = history.get_distribution(m=0)

First we can check the convergence of the epsilon value over iterations of the ABC algorithm.


In [ ]:
evolution = history.get_all_populations()
sns.relplot(x='t', y='epsilon', size='samples', data=evolution[evolution.t>=0])

We can check the posterior distribution of parameters for this model using the plot_parameters_kde function. This can highlight any parameters which were unidentifiable given the available experimental data.


In [ ]:
from ionchannelABC import plot_parameters_kde
plot_parameters_kde(df, w, limits, aspect=12, height=0.8)

We can generate some samples of model output using the posterior distribution of parameters to observe the effect on model output. We first create a sampling dataset then use the plot_sim_results function.


In [ ]:
n_samples = 10 # increasing this number will produce a better approximation to the true output, recommended: >= 100
               # we keep 10 to keep running time low
parameter_samples = df.sample(n=n_samples, weights=w, replace=True)
parameter_samples.head()

In [ ]:
parameter_samples = parameter_samples.to_dict(orient='records')
samples = pd.DataFrame({})
for i, theta in enumerate(parameter_samples):
    output = model.sample(pars=theta, n_x=50) # n_x changes the resolution of the independent variable
                                              # sometimes this can cause problems with output tending to zero/inf at
                                              # (e.g.) exact reversal potential of the channel model
    output['sample'] = i
    output['distribution'] = 'posterior'
    samples = samples.append(output, ignore_index=True)

In [ ]:
g = plot_sim_results(samples, obs=measurements)
xlabels = ["voltage, mV", "voltage, mV", "voltage, mV", "time, ms", "time, ms","voltage, mV"]
ylabels = ["current density, pA/pF", "activation", "inactivation", "recovery", "normalised current","current density, pA/pF"]
for ax, xl in zip(g.axes.flatten(), xlabels):
    ax.set_xlabel(xl)
for ax, yl in zip(g.axes.flatten(), ylabels):
    ax.set_ylabel(yl)

In this example, we see low variation of the model output around the experimental data across experiments. However, are all parameters well identified? (Consider the KDE posterior parameter distribution plot).

Finally, if we want to output quantitative measurements of the channel model we can interrogate out sampled dataset. For example, we can find the peak current density from the first experiment.


In [ ]:
peak_curr_mean = np.mean(samples[samples.exp==0].groupby('sample').min()['y'])
peak_curr_std = np.std(samples[samples.exp==0].groupby('sample').min()['y'])

print('Peak current density: {0:4.2f} +/- {1:4.2f} pA/pF'.format(peak_curr_mean, peak_curr_std))

Or if we are interested in the voltage at which the peak current occurs.


In [ ]:
peak_curr_V_indices = samples[samples.exp==0].groupby('sample').idxmin()['y']
peak_curr_V_mean = np.mean(samples.iloc[peak_curr_V_indices]['x'])
peak_curr_V_std = np.std(samples.iloc[peak_curr_V_indices]['x'])

print('Voltage of peak current density: {0:4.2f} +/- {1:4.2f} mV'.format(peak_curr_V_mean, peak_curr_V_std))

That concludes the main portion of this introduction. Further functionality is included below. For further examples of using the library, see the additional notebooks included for multiple HL-1 cardiac myocyte ion channels in the docs/examples folder.


Extra: Parameter sensitivity

The ionchannelABC library also includes functionality to test the sensitivity of a model to its parameters. This could be used to test which parameters we may expect to be unidentifiable in the ABC algorithm and would generally be carried out before the ABC algorithm is run.

The parameter sensitivity analysis is based on Sobie et al, Parameter sensitivity analysis in electrophysiological models using multivariable regression, 2009.

First, we need to define the distance function used and a list of the full name (including field in the MMT file) of parameters being passed to ABC.


In [ ]:
distance_fn = IonChannelDistance(obs=obs,
                                 exp_map=exp,
                                 err_bars=errs,
                                 err_th=0.1)
parameters = ['icat.'+k for k in limits.keys()]
print(parameters)

The calculate_parameter_sensitivity function carries out the calculations, and the output can be analysed using the plot_parameter_sensitivity and plot_regression_fit functions.


In [ ]:
from ionchannelABC import (calculate_parameter_sensitivity,
                           plot_parameter_sensitivity,
                           plot_regression_fit)

In [ ]:
fitted, regression_fit, r2 = calculate_parameter_sensitivity(
    model,
    parameters,
    distance_fn,
    sigma=0.05,    # affects how far parameters are perturbed from original values to test sensitivity
    n_samples=20)  # set to reduced value for demonstration, typically around 1000 in practical use

See Sobie et al, 2009 for an interpretation of the beta values and goodness-of-fit plots. In summary, a high beta value indicates the model has high sensitivity to changes in that parameter for a particular experiment protocol. However, this is conditional on a reasonable goodness-of-fit indicating the multivariable regression model is valid within this small pertubation space.


In [ ]:
plot_parameter_sensitivity(fitted, plot_cutoff=0.05)

In [ ]:
plot_regression_fit(regression_fit, r2)

In [ ]: