Causal discovery with TIGRAMITE

TIGRAMITE is a Python module for time series analysis. It reconstructs graphical models (conditional independence graphs) from discrete or continuous-valued time series using the PCMCI method and creates high-quality plots of the results.

PCMCI is described here: J. Runge, P. Nowack, M. Kretschmer, S. Flaxman, D. Sejdinovic, Detecting and quantifying causal associations in large nonlinear time series datasets. Sci. Adv. 5, eaau4996 (2019) https://advances.sciencemag.org/content/5/11/eaau4996

This tutorial explains how Tigramite handles missing values and masking and gives walk-through examples. See the following paper for theoretical background: Runge, Jakob. 2018. “Causal Network Reconstruction from Time Series: From Theoretical Assumptions to Practical Estimation.” Chaos: An Interdisciplinary Journal of Nonlinear Science 28 (7): 075310.

Last, the following Nature Communications Perspective paper provides an overview of causal inference methods in general, identifies promising applications, and discusses methodological challenges (exemplified in Earth system sciences): https://www.nature.com/articles/s41467-019-10105-3


In [8]:
# Imports
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline     
## use `%matplotlib notebook` for interactive figures
# plt.style.use('ggplot')
import sklearn

import tigramite
from tigramite import data_processing as pp
from tigramite import plotting as tp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests import ParCorr, GPDC, CMIknn, CMIsymb
from tigramite.models import LinearMediation, Prediction

Missing values and masking

Missing values

Tigramite handles missing values consistently. For example, values denoted as 999. in the data can be flagged by passing missing_flag=999. when the dataframe is created, as in pp.DataFrame(data, missing_flag=999.) in the cell below. All time slices in which a missing value occurs in any variable are then dismissed, with time lags handled consistently. To avoid biases, the subsequent samples for all lags up to 2*tau_max are dismissed as well. Missing values and masking will be described in more detail in a future paper.
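The dismissal rule described above can be illustrated with a small NumPy sketch. This is not Tigramite's implementation: the helper name usable_samples, its max_lag argument, and the toy array are hypothetical and exist only to show how one missing value removes its own time slice and the slices that would use it as a lagged value.

```python
import numpy as np

def usable_samples(data, missing_flag, max_lag):
    """Mark the time indices that remain usable (hypothetical helper).

    A time index t is dropped if any variable carries the missing flag
    at any of the times t, t-1, ..., t-max_lag. The first max_lag
    indices are dropped as well because they lack a full lag history.
    """
    T = data.shape[0]
    # True at every time step where at least one variable is missing
    missing_anywhere = np.any(data == missing_flag, axis=1)
    usable = np.ones(T, dtype=bool)
    usable[:max_lag] = False
    for tau in range(max_lag + 1):
        # A missing value at time s invalidates the sample at t = s + tau
        idx = np.where(missing_anywhere)[0] + tau
        usable[idx[idx < T]] = False
    return usable

# Toy data: 10 time steps, 2 variables, one missing value at t = 4
toy = np.arange(20, dtype=float).reshape(10, 2)
toy[4, 1] = 999.
mask = usable_samples(toy, missing_flag=999., max_lag=2)
print(np.where(mask)[0])  # [2 3 7 8 9]
```

With max_lag=2, the single missing value at t = 4 removes the samples at t = 4, 5 and 6. This is why the verbose output below reports far fewer usable samples than the 90 time steps that contain no missing value.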


In [2]:
np.random.seed(1)
data = np.random.randn(100, 3)
for t in range(1, 100):
    data[t, 0] += 0.7*data[t-1, 0] 
    data[t, 1] += 0.6*data[t-1, 1] + 0.6*data[t-1,0]
    data[t, 2] += 0.5*data[t-1, 2] + 0.6*data[t-1,1]
# Randomly mark 10% of values as missing values in variable 2
data[np.random.permutation(100)[:10], 2] = 999.

# Initialize dataframe object, specify time axis and variable names
var_names = [r'$X^0$', r'$X^1$', r'$X^2$']  # one name per data column
dataframe = pp.DataFrame(data, 
                         datatime = np.arange(len(data)), 
                         var_names=var_names,
                         missing_flag=999.)

tp.plot_timeseries(dataframe)
pcmci_parcorr = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr(verbosity=3), verbosity=4)
results = pcmci_parcorr.run_pcmci(tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
        p_matrix = results['p_matrix'], 
        val_matrix = results['val_matrix'],
        alpha_level = 0.01)


# Initialize conditional independence test

Parameters:
independence test = par_corr
significance = analytic

##
## Running Tigramite PC algorithm
##

Parameters:
independence test = par_corr
tau_min = 1
tau_max = 2
pc_alpha = 0.2
max_conds_dim = None
max_combinations = 1



## Variable $X^0$

Iterating through pc_alpha = [0.2]:

# pc_alpha = 0.2 (1/1):

Testing condition sets of dimension 0:

    Link ($X^0$ -1) --> $X^0$ (1/6):
            Constructed array of shape (2, 58) from
            X = [(0, -1)]
            Y = [(0, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.682
    No conditions of dimension 0 left.

    Link ($X^0$ -2) --> $X^0$ (2/6):
            Constructed array of shape (2, 58) from
            X = [(0, -2)]
            Y = [(0, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00002 / val = 0.532
    No conditions of dimension 0 left.

    Link ($X^1$ -1) --> $X^0$ (3/6):
            Constructed array of shape (2, 58) from
            X = [(1, -1)]
            Y = [(0, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00419 / val = 0.371
    No conditions of dimension 0 left.

    Link ($X^1$ -2) --> $X^0$ (4/6):
            Constructed array of shape (2, 58) from
            X = [(1, -2)]
            Y = [(0, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.19048 / val = 0.174
    No conditions of dimension 0 left.

    Link ($X^2$ -1) --> $X^0$ (5/6):
            Constructed array of shape (2, 58) from
            X = [(2, -1)]
            Y = [(0, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.99720 / val = 0.000
    Non-significance detected.

    Link ($X^2$ -2) --> $X^0$ (6/6):
            Constructed array of shape (2, 58) from
            X = [(2, -2)]
            Y = [(0, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.75189 / val = -0.042
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^0$ has 4 parent(s):
        ($X^0$ -1): max_pval = 0.00000, min_val = 0.682
        ($X^0$ -2): max_pval = 0.00002, min_val = 0.532
        ($X^1$ -1): max_pval = 0.00419, min_val = 0.371
        ($X^1$ -2): max_pval = 0.19048, min_val = 0.174

Testing condition sets of dimension 1:

    Link ($X^0$ -1) --> $X^0$ (1/4):
            Constructed array of shape (3, 58) from
            X = [(0, -1)]
            Y = [(0, 0)]
            Z = [(0, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -2)  --> pval = 0.00006 / val = 0.507
    No conditions of dimension 1 left.

    Link ($X^0$ -2) --> $X^0$ (2/4):
            Constructed array of shape (3, 58) from
            X = [(0, -2)]
            Y = [(0, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.61423 / val = 0.068
    Non-significance detected.

    Link ($X^1$ -1) --> $X^0$ (3/4):
            Constructed array of shape (3, 58) from
            X = [(1, -1)]
            Y = [(0, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.78960 / val = -0.036
    Non-significance detected.

    Link ($X^1$ -2) --> $X^0$ (4/4):
            Constructed array of shape (3, 58) from
            X = [(1, -2)]
            Y = [(0, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.23381 / val = -0.160
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^0$ has 1 parent(s):
        ($X^0$ -1): max_pval = 0.00006, min_val = 0.507

Algorithm converged for variable $X^0$

## Variable $X^1$

Iterating through pc_alpha = [0.2]:

# pc_alpha = 0.2 (1/1):

Testing condition sets of dimension 0:

    Link ($X^0$ -1) --> $X^1$ (1/6):
            Constructed array of shape (2, 58) from
            X = [(0, -1)]
            Y = [(1, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.800
    No conditions of dimension 0 left.

    Link ($X^0$ -2) --> $X^1$ (2/6):
            Constructed array of shape (2, 58) from
            X = [(0, -2)]
            Y = [(1, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.720
    No conditions of dimension 0 left.

    Link ($X^1$ -1) --> $X^1$ (3/6):
            Constructed array of shape (2, 58) from
            X = [(1, -1)]
            Y = [(1, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.734
    No conditions of dimension 0 left.

    Link ($X^1$ -2) --> $X^1$ (4/6):
            Constructed array of shape (2, 58) from
            X = [(1, -2)]
            Y = [(1, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00003 / val = 0.521
    No conditions of dimension 0 left.

    Link ($X^2$ -1) --> $X^1$ (5/6):
            Constructed array of shape (2, 58) from
            X = [(2, -1)]
            Y = [(1, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.09525 / val = 0.221
    No conditions of dimension 0 left.

    Link ($X^2$ -2) --> $X^1$ (6/6):
            Constructed array of shape (2, 58) from
            X = [(2, -2)]
            Y = [(1, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.92663 / val = 0.012
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^1$ has 5 parent(s):
        ($X^0$ -1): max_pval = 0.00000, min_val = 0.800
        ($X^1$ -1): max_pval = 0.00000, min_val = 0.734
        ($X^0$ -2): max_pval = 0.00000, min_val = 0.720
        ($X^1$ -2): max_pval = 0.00003, min_val = 0.521
        ($X^2$ -1): max_pval = 0.09525, min_val = 0.221

Testing condition sets of dimension 1:

    Link ($X^0$ -1) --> $X^1$ (1/5):
            Constructed array of shape (3, 58) from
            X = [(0, -1)]
            Y = [(1, 0)]
            Z = [(1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1)  --> pval = 0.00000 / val = 0.680
    No conditions of dimension 1 left.

    Link ($X^1$ -1) --> $X^1$ (2/5):
            Constructed array of shape (3, 58) from
            X = [(1, -1)]
            Y = [(1, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.00001 / val = 0.557
    No conditions of dimension 1 left.

    Link ($X^0$ -2) --> $X^1$ (3/5):
            Constructed array of shape (3, 58) from
            X = [(0, -2)]
            Y = [(1, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.01177 / val = 0.332
    No conditions of dimension 1 left.

    Link ($X^1$ -2) --> $X^1$ (4/5):
            Constructed array of shape (3, 58) from
            X = [(1, -2)]
            Y = [(1, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.00762 / val = 0.350
    No conditions of dimension 1 left.

    Link ($X^2$ -1) --> $X^1$ (5/5):
            Constructed array of shape (3, 58) from
            X = [(2, -1)]
            Y = [(1, 0)]
            Z = [(0, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1)  --> pval = 0.53826 / val = 0.083
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^1$ has 4 parent(s):
        ($X^0$ -1): max_pval = 0.00000, min_val = 0.680
        ($X^1$ -1): max_pval = 0.00001, min_val = 0.557
        ($X^1$ -2): max_pval = 0.00762, min_val = 0.350
        ($X^0$ -2): max_pval = 0.01177, min_val = 0.332

Testing condition sets of dimension 2:

    Link ($X^0$ -1) --> $X^1$ (1/4):
            Constructed array of shape (4, 58) from
            X = [(0, -1)]
            Y = [(1, 0)]
            Z = [(1, -1), (1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1) ($X^1$ -2)  --> pval = 0.00000 / val = 0.680
    Still conditions of dimension 2 left, but q_max = 1 reached.

    Link ($X^1$ -1) --> $X^1$ (2/4):
            Constructed array of shape (4, 58) from
            X = [(1, -1)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1) ($X^1$ -2)  --> pval = 0.00032 / val = 0.463
    Still conditions of dimension 2 left, but q_max = 1 reached.

    Link ($X^1$ -2) --> $X^1$ (3/4):
            Constructed array of shape (4, 58) from
            X = [(1, -2)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1) ($X^1$ -1)  --> pval = 0.93261 / val = 0.012
    Non-significance detected.

    Link ($X^0$ -2) --> $X^1$ (4/4):
            Constructed array of shape (4, 58) from
            X = [(0, -2)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^0$ -1) ($X^1$ -1)  --> pval = 0.97020 / val = 0.005
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^1$ has 2 parent(s):
        ($X^0$ -1): max_pval = 0.00000, min_val = 0.680
        ($X^1$ -1): max_pval = 0.00032, min_val = 0.463

Algorithm converged for variable $X^1$

## Variable $X^2$

Iterating through pc_alpha = [0.2]:

# pc_alpha = 0.2 (1/1):

Testing condition sets of dimension 0:

    Link ($X^0$ -1) --> $X^2$ (1/6):
            Constructed array of shape (2, 58) from
            X = [(0, -1)]
            Y = [(2, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00293 / val = 0.384
    No conditions of dimension 0 left.

    Link ($X^0$ -2) --> $X^2$ (2/6):
            Constructed array of shape (2, 58) from
            X = [(0, -2)]
            Y = [(2, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.607
    No conditions of dimension 0 left.

    Link ($X^1$ -1) --> $X^2$ (3/6):
            Constructed array of shape (2, 58) from
            X = [(1, -1)]
            Y = [(2, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.771
    No conditions of dimension 0 left.

    Link ($X^1$ -2) --> $X^2$ (4/6):
            Constructed array of shape (2, 58) from
            X = [(1, -2)]
            Y = [(2, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.773
    No conditions of dimension 0 left.

    Link ($X^2$ -1) --> $X^2$ (5/6):
            Constructed array of shape (2, 58) from
            X = [(2, -1)]
            Y = [(2, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.737
    No conditions of dimension 0 left.

    Link ($X^2$ -2) --> $X^2$ (6/6):
            Constructed array of shape (2, 58) from
            X = [(2, -2)]
            Y = [(2, 0)]
            Z = []
            with missing values = 999.0 removed
    Combination 0:  --> pval = 0.00000 / val = 0.607
    No conditions of dimension 0 left.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^2$ has 6 parent(s):
        ($X^1$ -2): max_pval = 0.00000, min_val = 0.773
        ($X^1$ -1): max_pval = 0.00000, min_val = 0.771
        ($X^2$ -1): max_pval = 0.00000, min_val = 0.737
        ($X^2$ -2): max_pval = 0.00000, min_val = 0.607
        ($X^0$ -2): max_pval = 0.00000, min_val = 0.607
        ($X^0$ -1): max_pval = 0.00293, min_val = 0.384

Testing condition sets of dimension 1:

    Link ($X^1$ -2) --> $X^2$ (1/6):
            Constructed array of shape (3, 58) from
            X = [(1, -2)]
            Y = [(2, 0)]
            Z = [(1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1)  --> pval = 0.00004 / val = 0.518
    No conditions of dimension 1 left.

    Link ($X^1$ -1) --> $X^2$ (2/6):
            Constructed array of shape (3, 58) from
            X = [(1, -1)]
            Y = [(2, 0)]
            Z = [(1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2)  --> pval = 0.00005 / val = 0.513
    No conditions of dimension 1 left.

    Link ($X^2$ -1) --> $X^2$ (3/6):
            Constructed array of shape (3, 58) from
            X = [(2, -1)]
            Y = [(2, 0)]
            Z = [(1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2)  --> pval = 0.00836 / val = 0.346
    No conditions of dimension 1 left.

    Link ($X^2$ -2) --> $X^2$ (4/6):
            Constructed array of shape (3, 58) from
            X = [(2, -2)]
            Y = [(2, 0)]
            Z = [(1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2)  --> pval = 0.00065 / val = 0.438
    No conditions of dimension 1 left.

    Link ($X^0$ -2) --> $X^2$ (5/6):
            Constructed array of shape (3, 58) from
            X = [(0, -2)]
            Y = [(2, 0)]
            Z = [(1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2)  --> pval = 0.00660 / val = 0.356
    No conditions of dimension 1 left.

    Link ($X^0$ -1) --> $X^2$ (6/6):
            Constructed array of shape (3, 58) from
            X = [(0, -1)]
            Y = [(2, 0)]
            Z = [(1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2)  --> pval = 0.40395 / val = 0.113
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^2$ has 5 parent(s):
        ($X^1$ -2): max_pval = 0.00004, min_val = 0.518
        ($X^1$ -1): max_pval = 0.00005, min_val = 0.513
        ($X^2$ -2): max_pval = 0.00065, min_val = 0.438
        ($X^0$ -2): max_pval = 0.00660, min_val = 0.356
        ($X^2$ -1): max_pval = 0.00836, min_val = 0.346

Testing condition sets of dimension 2:

    Link ($X^1$ -2) --> $X^2$ (1/5):
            Constructed array of shape (4, 58) from
            X = [(1, -2)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1) ($X^2$ -2)  --> pval = 0.03298 / val = 0.285
    Still conditions of dimension 2 left, but q_max = 1 reached.

    Link ($X^1$ -1) --> $X^2$ (2/5):
            Constructed array of shape (4, 58) from
            X = [(1, -1)]
            Y = [(2, 0)]
            Z = [(1, -2), (2, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2) ($X^2$ -2)  --> pval = 0.00000 / val = 0.687
    Still conditions of dimension 2 left, but q_max = 1 reached.

    Link ($X^2$ -2) --> $X^2$ (3/5):
            Constructed array of shape (4, 58) from
            X = [(2, -2)]
            Y = [(2, 0)]
            Z = [(1, -2), (1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2) ($X^1$ -1)  --> pval = 0.00000 / val = 0.649
    Still conditions of dimension 2 left, but q_max = 1 reached.

    Link ($X^0$ -2) --> $X^2$ (4/5):
            Constructed array of shape (4, 58) from
            X = [(0, -2)]
            Y = [(2, 0)]
            Z = [(1, -2), (1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2) ($X^1$ -1)  --> pval = 0.67377 / val = 0.058
    Non-significance detected.

    Link ($X^2$ -1) --> $X^2$ (5/5):
            Constructed array of shape (4, 58) from
            X = [(2, -1)]
            Y = [(2, 0)]
            Z = [(1, -2), (1, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -2) ($X^1$ -1)  --> pval = 0.00002 / val = 0.531
    Still conditions of dimension 2 left, but q_max = 1 reached.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^2$ has 4 parent(s):
        ($X^1$ -1): max_pval = 0.00005, min_val = 0.513
        ($X^2$ -2): max_pval = 0.00065, min_val = 0.438
        ($X^2$ -1): max_pval = 0.00836, min_val = 0.346
        ($X^1$ -2): max_pval = 0.03298, min_val = 0.285

Testing condition sets of dimension 3:

    Link ($X^1$ -1) --> $X^2$ (1/4):
            Constructed array of shape (5, 58) from
            X = [(1, -1)]
            Y = [(2, 0)]
            Z = [(2, -2), (2, -1), (1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^2$ -2) ($X^2$ -1) ($X^1$ -2)  --> pval = 0.00000 / val = 0.695
    Still conditions of dimension 3 left, but q_max = 1 reached.

    Link ($X^2$ -2) --> $X^2$ (2/4):
            Constructed array of shape (5, 58) from
            X = [(2, -2)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -1), (1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1) ($X^2$ -1) ($X^1$ -2)  --> pval = 0.00039 / val = 0.461
    Still conditions of dimension 3 left, but q_max = 1 reached.

    Link ($X^2$ -1) --> $X^2$ (3/4):
            Constructed array of shape (5, 58) from
            X = [(2, -1)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (1, -2)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1) ($X^2$ -2) ($X^1$ -2)  --> pval = 0.25270 / val = 0.157
    Non-significance detected.

    Link ($X^1$ -2) --> $X^2$ (4/4):
            Constructed array of shape (5, 58) from
            X = [(1, -2)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (2, -1)]
            with missing values = 999.0 removed
    Combination 0: ($X^1$ -1) ($X^2$ -2) ($X^2$ -1)  --> pval = 0.33333 / val = 0.133
    Non-significance detected.

    Sorting parents in decreasing order with 
    weight(i-tau->j) = min_{iterations} |I_{ij}(tau)| 

Updating parents:

    Variable $X^2$ has 2 parent(s):
        ($X^1$ -1): max_pval = 0.00005, min_val = 0.513
        ($X^2$ -2): max_pval = 0.00065, min_val = 0.438

Algorithm converged for variable $X^2$

## Resulting condition sets:

    Variable $X^0$ has 1 parent(s):
        ($X^0$ -1): max_pval = 0.00006, min_val = 0.507

    Variable $X^1$ has 2 parent(s):
        ($X^0$ -1): max_pval = 0.00000, min_val = 0.680
        ($X^1$ -1): max_pval = 0.00032, min_val = 0.463

    Variable $X^2$ has 2 parent(s):
        ($X^1$ -1): max_pval = 0.00005, min_val = 0.513
        ($X^2$ -2): max_pval = 0.00065, min_val = 0.438

##
## Running Tigramite MCI algorithm
##

Parameters:

independence test = par_corr
tau_min = 0
tau_max = 2
max_conds_py = None
max_conds_px = None

        link ($X^0$ -1) --> $X^0$ (1/8):
        with conds_y = [ ]
        with conds_x = [ ($X^0$ -2) ]
            Constructed array of shape (3, 58) from
            X = [(0, -1)]
            Y = [(0, 0)]
            Z = [(0, -2)]
            with missing values = 999.0 removed
        pval = 0.00006 | val = 0.507

        link ($X^0$ -2) --> $X^0$ (2/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^0$ -3) ]
            Constructed array of shape (4, 58) from
            X = [(0, -2)]
            Y = [(0, 0)]
            Z = [(0, -1), (0, -3)]
            with missing values = 999.0 removed
        pval = 0.55892 | val = 0.080

        link ($X^1$ 0) --> $X^0$ (3/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^0$ -1) ($X^1$ -1) ]
            Constructed array of shape (4, 58) from
            X = [(1, 0)]
            Y = [(0, 0)]
            Z = [(0, -1), (1, -1)]
            with missing values = 999.0 removed
        pval = 0.60886 | val = 0.070

        link ($X^1$ -1) --> $X^0$ (4/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^0$ -2) ($X^1$ -2) ]
            Constructed array of shape (5, 58) from
            X = [(1, -1)]
            Y = [(0, 0)]
            Z = [(0, -1), (0, -2), (1, -2)]
            with missing values = 999.0 removed
        pval = 0.92609 | val = 0.013

        link ($X^1$ -2) --> $X^0$ (5/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^0$ -3) ($X^1$ -3) ]
            Constructed array of shape (5, 58) from
            X = [(1, -2)]
            Y = [(0, 0)]
            Z = [(0, -1), (0, -3), (1, -3)]
            with missing values = 999.0 removed
        pval = 0.40860 | val = -0.114

        link ($X^2$ 0) --> $X^0$ (6/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^1$ -1) ($X^2$ -2) ]
            Constructed array of shape (5, 58) from
            X = [(2, 0)]
            Y = [(0, 0)]
            Z = [(0, -1), (1, -1), (2, -2)]
            with missing values = 999.0 removed
        pval = 0.44605 | val = -0.105

        link ($X^2$ -1) --> $X^0$ (7/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^1$ -2) ($X^2$ -3) ]
            Constructed array of shape (5, 58) from
            X = [(2, -1)]
            Y = [(0, 0)]
            Z = [(0, -1), (1, -2), (2, -3)]
            with missing values = 999.0 removed
        pval = 0.34628 | val = -0.129

        link ($X^2$ -2) --> $X^0$ (8/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^1$ -3) ($X^2$ -4) ]
            Constructed array of shape (5, 58) from
            X = [(2, -2)]
            Y = [(0, 0)]
            Z = [(0, -1), (1, -3), (2, -4)]
            with missing values = 999.0 removed
        pval = 0.86467 | val = -0.024

        link ($X^0$ 0) --> $X^1$ (1/8):
        with conds_y = [ ($X^0$ -1) ($X^1$ -1) ]
        with conds_x = [ ($X^0$ -1) ]
            Constructed array of shape (4, 58) from
            X = [(0, 0)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1)]
            with missing values = 999.0 removed
        pval = 0.60886 | val = 0.070

        link ($X^0$ -1) --> $X^1$ (2/8):
        with conds_y = [ ($X^1$ -1) ]
        with conds_x = [ ($X^0$ -2) ]
            Constructed array of shape (4, 58) from
            X = [(0, -1)]
            Y = [(1, 0)]
            Z = [(1, -1), (0, -2)]
            with missing values = 999.0 removed
        pval = 0.00000 | val = 0.610

        link ($X^0$ -2) --> $X^1$ (3/8):
        with conds_y = [ ($X^0$ -1) ($X^1$ -1) ]
        with conds_x = [ ($X^0$ -3) ]
            Constructed array of shape (5, 58) from
            X = [(0, -2)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1), (0, -3)]
            with missing values = 999.0 removed
        pval = 0.65340 | val = 0.062

        link ($X^1$ -1) --> $X^1$ (4/8):
        with conds_y = [ ($X^0$ -1) ]
        with conds_x = [ ($X^0$ -2) ($X^1$ -2) ]
            Constructed array of shape (5, 58) from
            X = [(1, -1)]
            Y = [(1, 0)]
            Z = [(0, -1), (0, -2), (1, -2)]
            with missing values = 999.0 removed
        pval = 0.00167 | val = 0.414

        link ($X^1$ -2) --> $X^1$ (5/8):
        with conds_y = [ ($X^0$ -1) ($X^1$ -1) ]
        with conds_x = [ ($X^0$ -3) ($X^1$ -3) ]
            Constructed array of shape (6, 58) from
            X = [(1, -2)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1), (0, -3), (1, -3)]
            with missing values = 999.0 removed
        pval = 0.05226 | val = 0.266

        link ($X^2$ 0) --> $X^1$ (6/8):
        with conds_y = [ ($X^0$ -1) ($X^1$ -1) ]
        with conds_x = [ ($X^1$ -1) ($X^2$ -2) ]
            Constructed array of shape (5, 58) from
            X = [(2, 0)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1), (2, -2)]
            with missing values = 999.0 removed
        pval = 0.21886 | val = 0.168

        link ($X^2$ -1) --> $X^1$ (7/8):
        with conds_y = [ ($X^0$ -1) ($X^1$ -1) ]
        with conds_x = [ ($X^1$ -2) ($X^2$ -3) ]
            Constructed array of shape (6, 58) from
            X = [(2, -1)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1), (1, -2), (2, -3)]
            with missing values = 999.0 removed
        pval = 0.11950 | val = -0.214

        link ($X^2$ -2) --> $X^1$ (8/8):
        with conds_y = [ ($X^0$ -1) ($X^1$ -1) ]
        with conds_x = [ ($X^1$ -3) ($X^2$ -4) ]
            Constructed array of shape (6, 58) from
            X = [(2, -2)]
            Y = [(1, 0)]
            Z = [(0, -1), (1, -1), (1, -3), (2, -4)]
            with missing values = 999.0 removed
        pval = 0.97732 | val = 0.004

        link ($X^0$ 0) --> $X^2$ (1/8):
        with conds_y = [ ($X^1$ -1) ($X^2$ -2) ]
        with conds_x = [ ($X^0$ -1) ]
            Constructed array of shape (5, 58) from
            X = [(0, 0)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (0, -1)]
            with missing values = 999.0 removed
        pval = 0.44605 | val = -0.105

        link ($X^0$ -1) --> $X^2$ (2/8):
        with conds_y = [ ($X^1$ -1) ($X^2$ -2) ]
        with conds_x = [ ($X^0$ -2) ]
            Constructed array of shape (5, 58) from
            X = [(0, -1)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (0, -2)]
            with missing values = 999.0 removed
        pval = 0.33813 | val = -0.132

        link ($X^0$ -2) --> $X^2$ (3/8):
        with conds_y = [ ($X^1$ -1) ($X^2$ -2) ]
        with conds_x = [ ($X^0$ -3) ]
            Constructed array of shape (5, 58) from
            X = [(0, -2)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (0, -3)]
            with missing values = 999.0 removed
        pval = 0.12526 | val = -0.209

        link ($X^1$ 0) --> $X^2$ (4/8):
        with conds_y = [ ($X^1$ -1) ($X^2$ -2) ]
        with conds_x = [ ($X^0$ -1) ($X^1$ -1) ]
            Constructed array of shape (5, 58) from
            X = [(1, 0)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (0, -1)]
            with missing values = 999.0 removed
        pval = 0.21886 | val = 0.168

        link ($X^1$ -1) --> $X^2$ (5/8):
        with conds_y = [ ($X^2$ -2) ]
        with conds_x = [ ($X^0$ -2) ($X^1$ -2) ]
            Constructed array of shape (5, 58) from
            X = [(1, -1)]
            Y = [(2, 0)]
            Z = [(2, -2), (0, -2), (1, -2)]
            with missing values = 999.0 removed
        pval = 0.00000 | val = 0.606

        link ($X^1$ -2) --> $X^2$ (6/8):
        with conds_y = [ ($X^1$ -1) ($X^2$ -2) ]
        with conds_x = [ ($X^0$ -3) ($X^1$ -3) ]
            Constructed array of shape (6, 58) from
            X = [(1, -2)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (0, -3), (1, -3)]
            with missing values = 999.0 removed
        pval = 0.32855 | val = 0.136

        link ($X^2$ -1) --> $X^2$ (7/8):
        with conds_y = [ ($X^1$ -1) ($X^2$ -2) ]
        with conds_x = [ ($X^1$ -2) ($X^2$ -3) ]
            Constructed array of shape (6, 58) from
            X = [(2, -1)]
            Y = [(2, 0)]
            Z = [(1, -1), (2, -2), (1, -2), (2, -3)]
            with missing values = 999.0 removed
        pval = 0.54650 | val = 0.084

        link ($X^2$ -2) --> $X^2$ (8/8):
        with conds_y = [ ($X^1$ -1) ]
        with conds_x = [ ($X^1$ -3) ($X^2$ -4) ]
            Constructed array of shape (5, 58) from
            X = [(2, -2)]
            Y = [(2, 0)]
            Z = [(1, -1), (1, -3), (2, -4)]
            with missing values = 999.0 removed
        pval = 0.07281 | val = 0.244

## Significant links at alpha = 0.05:

    Variable $X^0$ has 1 link(s):
        ($X^0$ -1): pval = 0.00006 | val = 0.507 | conf = (0.000, 0.000)

    Variable $X^1$ has 2 link(s):
        ($X^0$ -1): pval = 0.00000 | val = 0.610 | conf = (0.000, 0.000)
        ($X^1$ -1): pval = 0.00167 | val = 0.414 | conf = (0.000, 0.000)

    Variable $X^2$ has 1 link(s):
        ($X^1$ -1): pval = 0.00000 | val = 0.606 | conf = (0.000, 0.000)

## Significant links at alpha = 0.01:

    Variable $X^0$ has 1 link(s):
        ($X^0$ -1): pval = 0.00006 | val = 0.507

    Variable $X^1$ has 2 link(s):
        ($X^0$ -1): pval = 0.00000 | val = 0.610
        ($X^1$ -1): pval = 0.00167 | val = 0.414

    Variable $X^2$ has 1 link(s):
        ($X^1$ -1): pval = 0.00000 | val = 0.606

Masking

In contrast to missing values, masking can be used to include or exclude samples depending on the context. For example, in climate research we are frequently interested in detecting the drivers of a target variable only in winter months. Suppose we are given time series ($X^1, X^2, X^3$) and want to estimate the causal parents affecting the variables in winter months. This can be achieved by setting mask_type='y' when initializing ParCorr and marking all samples of ($X^1, X^2, X^3$) that lie outside the winter months in mask with a 1 (or True). As in the example below, entries marked True are masked, i.e., excluded, so that only the winter samples remain.

During a PCMCI analysis many conditional independence tests $X_{t-\tau} \perp Y_t | Z$ are carried out, where $X,Y,Z$ denote any of the $X^1,X^2,X^3$ in different iterations of the PCMCI analysis. For example, to estimate the parents of $X^2$, the argument $X_{t-\tau}$ as well as $Z$ will iterate through ($X^1, X^2, X^3$) including their lags. Tigramite then considers only winter samples of $Y=X^2$, using the mask of $X^2$, while lagged samples of $X$ or $Z$ can also come, e.g., from the previous summer. If we want all samples, including those in $X$ and $Z$, to be restricted to the respective mask, we need to mark them in mask as well and set mask_type='yxz'. Correspondingly, mask_type='z' or any other combination is also possible. Missing values and masking will be described in more detail in a future paper.
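The sample selection described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not Tigramite's internal implementation: the helper select_samples and the monthly toy calendar are hypothetical, and the sketch follows the convention of the example further below, in which entries marked True in the mask are excluded.

```python
import numpy as np

def select_samples(mask_y, mask_x, tau, mask_type):
    """Return the time indices t kept for a test of X_{t-tau} against Y_t.

    Hypothetical helper: True in a mask means "exclude this sample".
    """
    T = len(mask_y)
    t = np.arange(tau, T)              # times for which the lagged sample exists
    keep = np.ones(len(t), dtype=bool)
    if 'y' in mask_type:
        keep &= ~mask_y[t]             # Y_t must be unmasked
    if 'x' in mask_type:
        keep &= ~mask_x[t - tau]       # X_{t-tau} must be unmasked as well
    return t[keep]

# Toy monthly calendar over 3 years: month 0 = January, ..., 11 = December
months = np.arange(36) % 12
winter = np.isin(months, [11, 0, 1])   # December, January, February
mask = ~winter                         # mask everything that is not winter

tau = 2
kept_y = select_samples(mask, mask, tau, mask_type='y')
kept_yx = select_samples(mask, mask, tau, mask_type='yx')

print("mask_type='y'  keeps t =", kept_y)
print("  months of the lagged samples:", months[kept_y - tau])
print("mask_type='yx' keeps t =", kept_yx)
```

In this toy setting mask_type='y' keeps seven winter target samples whose lagged partners fall in October to December, whereas mask_type='yx' keeps only the two samples for which the lagged value also lies in winter. Restricting more arguments to the mask therefore reduces the effective sample size.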

In the following example, we generate data whose underlying causal mechanism differs between the winter and summer half years: the causal effect of the first variable on the second has opposite sign in the two seasons.


In [4]:
# Masking demo: we consider time series in which one part is generated by a
# different causal process than the other part.
np.random.seed(42)
T = 1000
data = np.random.randn(T, 2)
data_mask = np.zeros(data.shape)
for t in range(1, T):
    if (t % 365) < 3*30 or (t % 365) > 8*30:
        # Winter half year: variable 0 drives variable 1 with a positive sign
        data[t, 0] += 0.4*data[t-1, 0]
        data[t, 1] += 0.3*data[t-1, 1] + 0.9*data[t-1, 0]
    else:
        # Summer half year: same link with a negative sign. Mask these samples
        # (and their predecessors) so that only winter samples stay unmasked.
        data_mask[[t, t-1]] = True
        data[t, 0] += 0.4*data[t-1, 0]
        data[t, 1] += 0.3*data[t-1, 1] - 0.9*data[t-1, 0]

T, N = data.shape
dataframe = pp.DataFrame(data, mask=data_mask)
tp.plot_timeseries(dataframe, figsize=(8,3), use_mask=True, grey_masked_samples='data')


Out[4]:
(<Figure size 576x216 with 2 Axes>,
 array([<matplotlib.axes._subplots.AxesSubplot object at 0x7fdd3db89208>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7fdd3c34a518>],
       dtype=object))

In [5]:
# Setup analysis
def run_and_plot(cond_ind_test, fig_ax):
    """Run PCMCI with the given test and plot the links significant at 0.01."""
    pcmci = PCMCI(dataframe=dataframe, cond_ind_test=cond_ind_test)
    results = pcmci.run_pcmci(tau_max=2, pc_alpha=0.2)
    link_matrix = pcmci.return_significant_parents(
        pq_matrix=results['p_matrix'],
        val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
    tp.plot_graph(fig_ax=fig_ax, val_matrix=results['val_matrix'],
                  link_matrix=link_matrix, var_names=var_names)

In [6]:
# Causal graph of the whole year yields no link because the effects average out
fig = plt.figure(figsize=(3,2)); ax = fig.add_subplot(111)
run_and_plot(ParCorr(mask_type=None), (fig, ax))

# Causal graph of the winter half only gives a positive link
fig = plt.figure(figsize=(3,2)); ax = fig.add_subplot(111)
run_and_plot(ParCorr(mask_type='y'), (fig, ax))

# Causal graph of the summer half only gives a negative link
fig = plt.figure(figsize=(3,2)); ax = fig.add_subplot(111)
# Invert the mask so that winter is now masked and summer is used
dataframe.mask = (dataframe.mask == False)
run_and_plot(ParCorr(mask_type='y'), (fig, ax))


Note, however, that the failure to detect the link on the whole sample is specific to partial correlation, a linear measure for which the positive and negative dependencies cancel out. Using CMIknn recovers the link (but gets a false positive for this realization):


In [7]:
# mask_type=None: the mask is ignored and the whole sample is used
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=CMIknn(mask_type=None, transform='ranks'))
results = pcmci.run_pcmci(tau_max=2, pc_alpha=0.2)
link_matrix = pcmci.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
fig = plt.figure(figsize=(3,2)); ax = fig.add_subplot(111)
tp.plot_graph(fig_ax=(fig, ax), val_matrix=results['val_matrix'],
              link_matrix=link_matrix, var_names=var_names)


Out[7]:
(<Figure size 216x144 with 3 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fdd35d0a160>)
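
The cancellation noted above can be illustrated without Tigramite. The following is a minimal sketch under simplifying assumptions: there are no autoregressive terms, the two regimes are exactly balanced, and all variable names are hypothetical. The correlation of squared values is used only as a simple stand-in for a measure that is sensitive to nonlinear dependence. It is not what CMIknn computes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x = rng.standard_normal(n)

# Two regimes of equal length with opposite sign of the effect of x on y
sign = np.where(np.arange(n) < n // 2, 1.0, -1.0)
y = 0.9 * sign * x + rng.standard_normal(n)

first, second = sign > 0, sign < 0

# Linear correlation on the pooled sample and within each regime
corr_pooled = np.corrcoef(x, y)[0, 1]
corr_first = np.corrcoef(x[first], y[first])[0, 1]
corr_second = np.corrcoef(x[second], y[second])[0, 1]

# A dependence that survives pooling: the sign drops out when squaring
corr_squares = np.corrcoef(x**2, y**2)[0, 1]

print("pooled:        %.3f" % corr_pooled)
print("first regime:  %.3f" % corr_first)
print("second regime: %.3f" % corr_second)
print("squares:       %.3f" % corr_squares)
```

The pooled linear correlation is close to zero although each regime taken separately shows a clear dependence of opposite sign, while the squared values remain correlated on the pooled sample. This is why a linear test can miss the link on the whole sample where a nonlinear test can still detect it.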
