Workshop Tutorial: Note that all slides have been optimized for a RISE slideshow


In [1]:
# Ignore warnings that distract from the tutorial
import warnings
warnings.simplefilter("ignore", category=FutureWarning)

Validating Computational Models with



Richard C. (Rick) Gerkin, PhD

Associate Research Professor, School of Life Sciences
Co-Director, Laboratory for Informatics and Computation in Open Neuroscience
Arizona State University, Tempe, AZ USA


[SciUnit](http://sciunit.scidash.org) is a framework for validating scientific models by creating experimental-data-driven unit tests.

Conventionally, a unit test is “a strict, written contract that the piece of code must satisfy''

SciUnit extends this idea from generic computer programs to scientific models, making scientific model validation both formal and transparent.

The validity of a model is then represented by the collection of unit tests that it passes.

[SciUnit](http://sciunit.scidash.org) is a Python package. Let's install it, then make sure it can be imported.


In [2]:
# Installation of the `sciunit` package from PyPI
!pip install -q sciunit

# Import the package
import sciunit

In [3]:
# This code cell exists just to generate the mockup table a few cells below.
# Further down you will see that SciUnit can generate such tables from scores automatically!
from IPython.display import display, HTML
display(HTML("""
<style>  
    td.red {
        background-color: #FF0000;
        border: 1px solid black;
    }
    td.green {
        background-color: #00FF00;
        border: 1px solid black;
    }
    td.grey {
        background-color: #AAAAAA;
        border: 1px solid black;
    }
    th {
        text-align: center;
        border: 1px solid black;
    }
    table td, table th {
    border: 10px solid black;
    font-size: 250%;
    }
</style>
"""))


Toy example: A brief history of cosmology

Experimentalists
Modelers
BabyloniansBraheGalileoLe Verrier
Ptolemy
Copernicus
Kepler
Newton
Einstein
PassFailUnclear

Model validation goals:

- Generate one unit tests for each experimental datum (or stylized fact about data)

- Execute these tests against all models capable of taking them

- Programatically display the results as a “dashboard" of model validity

  • Optionally record and display non-boolean test results, test artifacts, etc.

High-level workflow for validation:

# Hypothetical examples of data-driven tests
from cosmounit.tests import brahe_test, galileo_test, leverrier_test

# Hypothetical examples of parameterized models
from cosmounit.models import ptolemy_model, copernicus_model 

# Execute one test against one model and return a test score
score = brahe_test.judge(copernicus_model)

This is the only code-like cell of the tutorial that doesn't contain executable code, since it is a high-level abstraction. Don't worry, you'll be running real code just a few cells down!

Q: How does a test “know" how to test a model?

A: Through guarantees that models provide to tests, called “Capabilities".

Next we show an example of a Capability relevant to the cosmology case outlined above.


In [4]:
# Some imports to make the code below run
from math import pi, sqrt, sin, cos, tan, atan
from datetime import datetime, timedelta
import numpy as np
# SymPy is needed because one of Kepler's equations
# is in implicit form and must be solved numerically!
from sympy import Symbol, solvers, sin as sin_

In [5]:
class ProducesOrbitalPosition(sciunit.Capability):
    """
    A model `capability`, i.e. a collection of methods that a test is allowed to invoke on a model.
    These methods are unimplemented by design, and the model must implement them.
    """
    
    def get_position(self, t: datetime) -> tuple:
        """Produce an orbital position from a time point
        in polar coordinates.
        
        Args:
            t (datetime): The time point to examine, relative to perihelion
        Returns:
            tuple: A pair of (r, theta) coordinates in the oribtal plane
        """
        raise NotImplementedError("")
        
    @property
    def perihelion(self) -> datetime:
        """Return the time of last perihelion"""
        raise NotImplementedError("")
    
    @property
    def period(self) -> float:
        """Return the period of the orbit"""
        raise NotImplementedError("")
    
    @property
    def eccentricity(self) -> float:
        """Return the eccentricity of the orbit"""
        raise NotImplementedError("")
    
    def get_x_y(self, t: datetime) -> tuple:
        """Produce an orbital position from a time point, but in cartesian coordinates.
        This method does not require a model-specific implementation.
        Thus, a generic implementation can be provided in advance."""
        r, theta = self.get_position(t)
        x, y = r*cos(theta), r*sin(theta)
        return x, y

[SciUnit](http://sciunit.scidash.org) (and domain specific libraries that build upon it) also define their own capabilities


In [6]:
# An extremely generic model capability
from sciunit.capabilities import ProducesNumber

# A specific model capability used in neurophysiology
#from neuronunit.capabilities import HasMembranePotential

Now we can define a model class that implements this ProducesOrbitalPosition capability by inheritance.

All models are subclasses of sciunit.Model and typically one or more subclasses of sciunit.Capability.


In [7]:
class BaseKeplerModel(sciunit.Model, 
                      ProducesOrbitalPosition):
    """A sciunit model class corresponding to a Kepler-type model
    of an object in the solar system. This model has the 
    `ProducesOrbitalPosition` capability by inheritance,
    so it must implement all of the unimplemented methods of that capability"""
    
    def get_position(self, t):
        """Implementation of polar coordinate position as a function of time"""
        r, theta = self.heliocentric_distance(t), self.true_anomaly(t)
        return r, theta
    
    @property
    def perihelion(self):
        """Implementation of time of last perihelion"""
        return self.params['perihelion']
    
    @property
    def period(self):
        """Implementation of period of the orbit"""
        return self.params['period']
    
    @property
    def eccentricity(self):
        """Implementation of orbital eccentricity (assuming elliptic orbit)"""
        a, b = self.params['semimajor_axis'], self.params['semiminor_axis']
        return sqrt(1 - (b/a)**2)

In [8]:
class KeplerModel(BaseKeplerModel):
    """This 'full' model contains all of the methods required
    to complete the implementation of the `ProducesOrbitalPosition` capability"""
    
    def mean_anomaly(self, t):
        """How long into its period the object is at time `t`"""
        time_since_perihelion = t - self.perihelion
        return 2*pi*(time_since_perihelion % self.period)/self.period
    
    def eccentric_anomaly(self, t):
        """How far the object has gone into its period at time `t`"""
        E = Symbol('E')
        M, e = self.mean_anomaly(t), self.eccentricity
        expr = E - e*sin_(E) - M
        return solvers.nsolve(expr, 0)
    
    def true_anomaly(self, t):
        """Theta in a polar coordinate system at time `t`"""
        e, E = self.eccentricity, self.eccentric_anomaly(t)
        theta = 2*atan(sqrt(tan(E/2)**2 * (1+e)/(1-e)))
        return theta
    
    def heliocentric_distance(self, t):
        """R in a polar coordinate system at time `t`"""
        a, e = self.params['semimajor_axis'], self.eccentricity
        E = self.eccentric_anomaly(t)
        return a*(1-e*cos(E))

Now we can instantiate a specific model from this class, e.g. one representing the orbital path of Earth (according to Kepler)


In [9]:
# The quantities module to put dimensional units on values
import quantities as pq

# `earth_model` will be a specific instance of KeplerModel, with its own parameters
earth_model = KeplerModel(name = "Kepler's Earth Model",
                          semimajor_axis=149598023 * pq.km, 
                          semiminor_axis=149577161 * pq.km, 
                          period=timedelta(365, 22118), # Period of Earth's orbit 
                          perihelion=datetime(2019, 1, 3, 0, 19), # Time and date of Earth's last perihelion
                         )

We can use this model to make specific predictions, for example the current distance between Earth and the sun.


In [10]:
# The time right now
t = datetime.now()
# Predicted distance from the sun, right now
r = earth_model.heliocentric_distance(t)
print("Heliocentric distance of Earth right now is predicted to be %s" % r.round(1))


Heliocentric distance of Earth right now is predicted to be 151702813.7 km

Now let's build a test class that we might use to validate (i.e. unit test to produce test scores) with this (and hopefully other) models

First, what kind of scores do we want our test to return?


In [11]:
# Several score types available in SciUnit
from sciunit.scores import BooleanScore, ZScore, RatioScore, PercentScore # etc., etc.

Here's a first shot a test class for assessing the agreement between predicted and observed positions of orbiting objects. All test classes are subclasses of sciunit.Test.


In [12]:
class PositionTest(sciunit.Test):
    """A test of a planetary position at some specified time"""
    
    # This test can only operate on models that implement
    # the `ProducesOrbitalPosition` capability.
    required_capabilities = (ProducesOrbitalPosition,)
    score_type = BooleanScore # This test's 'judge' method will return a BooleanScore.
    
    def generate_prediction(self, model):
        """Generate a prediction from a model"""
        t = self.observation['t'] # Get the time point from the test's observation
        x, y = model.get_x_y(t) # Get the predicted x, y coordinates from the model
        return {'t': t, 'x': x, 'y': y} # Roll this into a model prediction dictionary
        
    def compute_score(self, observation, prediction):
        """Compute a test score based on the agreement between
        the observation (data) and prediction (model)"""
        # Compare observation and prediction to get an error measure
        delta_x = observation['x'] - prediction['x']
        delta_y = observation['y'] - prediction['y']
        error = np.sqrt(delta_x**2 + delta_y**2)
        
        passing = bool(error < 1e5*pq.kilometer) # Turn this into a True/False score
        score = self.score_type(passing) # Create a sciunit.Score object
        score.set_raw(error) # Add some information about how this score was obtained
        score.description = ("Passing score if the prediction is "
                             "within < 100,000 km of the observation") # Describe the scoring logic
        return score

We might want to include extra checks and constraints on observed data, test parameters, or other contingent testing logic.


In [13]:
class StricterPositionTest(PositionTest):
    # Optional observation units to validate against
    units = pq.meter
    
    # Optional schema for the format of observed data
    observation_schema = {'t': {'min': 0, 'required': True},
                          'x': {'units': True, 'required': True},
                          'y': {'units': True, 'required': True},
                          'phi': {'required': False}}
    
    def validate_observation(self, observation):
        """Additional checks on the observation"""
        assert isinstance(observation['t'], datetime)
        return observation
        
    # Optional schema for the format of test parameters
    params_schema = {'rotate': {'required': False}}

    # Optional schema for the format of default test parameters
    default_params = {'rotate': False}
    
    def compute_score(self, observation, prediction):
        """Optionally use additional information to compute model/data agreement"""
        observation_rotated = observation.copy()
        if 'phi' in observation:
            # Project x and y values onto the plane defined by `phi`.
            observation_rotated['x'] *= cos(observation['phi'])
            observation_rotated['y'] *= cos(observation['phi'])
        return super().compute_score(observation_rotated, prediction)

Now we can instantiate a test. Each test instance is a combination of the test class, describing the testing logic and required capabilties, plus some 'observation', i.e. data.


In [14]:
# A single test instance, best on the test class `StricterPositionTest` combined with
# a specific set of observed data (a time and some x, y coordinates)
# N.B.: This data is made up for illustration purposes
earth_position_test_march = StricterPositionTest(name = "Earth Orbital Data on March 1st, 2019",
                                                 observation = {'t': datetime(2019, 3, 1),
                                                                'x': 7.905e7 * pq.km, 
                                                                'y': 1.254e8 * pq.km})

Finally, we can execute this one test against this one model


In [15]:
# Execute `earth_position_test` against `earth_model` and return a score
score = earth_position_test_march.judge(earth_model)
# Display the score
score


Out[15]:
Pass

And we can get additional information about the test, including intermediate objects computed in order to generate a score.


In [16]:
# Describe the score in plain language
score.describe()


Passing score if the prediction is within < 100,000 km of the observation

In [17]:
# What were the prediction and observation used to compute the score?
score.prediction, score.observation


Out[17]:
({'t': datetime.datetime(2019, 3, 1, 0, 0),
  'x': array(79046604.57417324) * km,
  'y': array(1.25401809e+08) * km},
 {'t': datetime.datetime(2019, 3, 1, 0, 0),
  'x': array(79050000.) * km,
  'y': array(1.254e+08) * km})

In [18]:
# What was the raw error before the decision criterion was applied?
score.get_raw()


Out[18]:
array(3847.28076371) * km

We may want to bundle many such tests into a TestSuite. This suite may contain test from multiple classes, or simply tests which differ only in the observation (data) used to instantiate them.


In [19]:
# A new test for a new month: same test class, new observation (data)
# N.B. I deliberately picked "observed" values that will make the model fail this test
earth_position_test_april = StricterPositionTest(name = "Earth Orbital Data on April 1st, 2019",
                                                 observation = {'t': datetime(2019, 4, 1),
                                                                'x': 160000 * pq.km, 
                                                                'y': 70000 * pq.km})

# A test suite built from both of the tests that we have instantiated
earth_position_suite = sciunit.TestSuite([earth_position_test_march,
                                          earth_position_test_april],
                                         name = 'Earth observations in Spring, 2019')

We can then test our model against this whole suite of tests


In [20]:
# Run the whole suite (two tests) against one model
scores = earth_position_suite.judge(earth_model)


Executing test Earth Orbital Data on March 1st, 2019 on model Kepler's Earth Model...
Score is Pass
Executing test Earth Orbital Data on April 1st, 2019 on model Kepler's Earth Model...
Score is Fail

Rich HTML output is automatically produced when this score output is summarized


In [21]:
# Display the returned `scores` object
scores


Out[21]:
Earth Orbital Data on March 1st, 2019 Earth Orbital Data on April 1st, 2019
Kepler's Earth Model Pass Fail

We can then expand this to multiple models


In [22]:
# Just like the Kepler model, but returning a random orbital angle
class RandomModel(KeplerModel):
    def get_position(self, t):
        r, theta = super().get_position(t)
        return r, 2*pi*np.random.rand()

In [23]:
# A new model instance, using the same parameters but a different underlying model class
random_model = RandomModel(name = "Random Earth Model",
                           semimajor_axis=149598023 * pq.km, 
                           semiminor_axis=149577161 * pq.km, 
                           period=timedelta(365, 22118), # Period of Earth's orbit 
                           perihelion=datetime(2019, 1, 3, 0, 19), # Time and date of Earth's last perihelion
                           )

In [24]:
# Run the whole suite (two tests) against two models
scores = earth_position_suite.judge([earth_model, random_model])


Executing test Earth Orbital Data on March 1st, 2019 on model Kepler's Earth Model...
Score is Pass
Executing test Earth Orbital Data on April 1st, 2019 on model Kepler's Earth Model...
Score is Fail
Executing test Earth Orbital Data on March 1st, 2019 on model Random Earth Model...
Score is Fail
Executing test Earth Orbital Data on April 1st, 2019 on model Random Earth Model...
Score is Fail

In [25]:
# Display the returned `scores` object
scores


Out[25]:
Earth Orbital Data on March 1st, 2019 Earth Orbital Data on April 1st, 2019
Kepler's Earth Model Pass Fail
Random Earth Model Fail Fail

Or extract just a slice:


In [26]:
# All the scores for just one model
scores[earth_model]


Out[26]:
Earth Orbital Data on March 1st, 2019    Pass
Earth Orbital Data on April 1st, 2019    Fail
Name: Kepler's Earth Model, dtype: object

In [27]:
# All the scores for just one test
scores[earth_position_test_march]


Out[27]:
Kepler's Earth Model    Pass
Random Earth Model      Fail
Name: Earth Orbital Data on March 1st, 2019, dtype: object

What about models that can't take a certain test? Some models aren't capable (even in principle) of doing what the test is asking of them.


In [28]:
# A simple model which has some capabilities, 
# but not the ones needed for the orbital position test
class SimpleModel(sciunit.Model,
                  sciunit.capabilities.ProducesNumber):
    pass

simple_model = SimpleModel()

In [29]:
# Run the whole suite (two tests) against two models
scores = earth_position_suite.judge([earth_model, random_model, simple_model])


Executing test Earth Orbital Data on March 1st, 2019 on model Kepler's Earth Model...
Score is Pass
Executing test Earth Orbital Data on April 1st, 2019 on model Kepler's Earth Model...
Score is Fail
Executing test Earth Orbital Data on March 1st, 2019 on model Random Earth Model...
Score is Fail
Executing test Earth Orbital Data on April 1st, 2019 on model Random Earth Model...
Score is Fail
Executing test Earth Orbital Data on March 1st, 2019 on model SimpleModel...
Score is N/A
Executing test Earth Orbital Data on April 1st, 2019 on model SimpleModel...
Score is N/A

Incapable models don't fail, they get the equivalent of 'incomplete' grades


In [30]:
# Display the returned `scores` object
scores


Out[30]:
Earth Orbital Data on March 1st, 2019 Earth Orbital Data on April 1st, 2019
Kepler's Earth Model Pass Fail
Random Earth Model Fail Fail
SimpleModel N/A N/A

[SciUnit](http://sciunit.scidash.org) is in use in several multiscale modeling projects including:

- The Human Brain Project (neurophysiology, neuroanatomy, neuroimaging)

- OpenWorm (biophysics, network dynamics, animal behavior)

[NeuronUnit](http://neuronunit.scidash.org) is a reference implementation in the domain of neurophysiology of:

- model classes

- test classes

- capability classes

- tools for constructing tests from several public neurophysiology databases

- tools for implementing capabilities from standard model exchange formats

- test-driven model optimization

[SciDash](http://dash.scidash.org) is a web application for creating, scheduling, and viewing the results of SciUnit tests without writing a single line of code.


Links:


Funded by:

R01DC018455 (NIDCD) R01MH106674 (NIMH)
R01EB021711 (NIBIB) Human Brain Project

Thanks also to Sharon Crook, Justas Birgiolas, Russell Jarvis, and Cyrus Omar