TIGRAMITE
TIGRAMITE is a Python module for time series analysis. It allows reconstructing graphical models (conditional independence graphs) from discrete or continuous-valued time series based on the PCMCI method, and creating high-quality plots of the results.
PCMCI is described here: J. Runge, P. Nowack, M. Kretschmer, S. Flaxman, D. Sejdinovic, Detecting and quantifying causal associations in large nonlinear time series datasets. Sci. Adv. 5, eaau4996 (2019) https://advances.sciencemag.org/content/5/11/eaau4996
This tutorial explains the causal assumptions and gives walk-through examples. See the following paper for theoretical background: Runge, Jakob. 2018. “Causal Network Reconstruction from Time Series: From Theoretical Assumptions to Practical Estimation.” Chaos: An Interdisciplinary Journal of Nonlinear Science 28 (7): 075310.
Last, the following Nature Communications Perspective paper provides an overview of causal inference methods in general, identifies promising applications, and discusses methodological challenges (exemplified in Earth system sciences): https://www.nature.com/articles/s41467-019-10105-3
In [24]:
# Imports
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
## use `%matplotlib notebook` for interactive figures
# plt.style.use('ggplot')
import sklearn
import tigramite
from tigramite import data_processing as pp
from tigramite import plotting as tp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests import ParCorr, GPDC, CMIknn, CMIsymb
from tigramite.models import LinearMediation, Prediction
Having introduced the basic functionality, we now turn to a discussion of the assumptions underlying a causal interpretation:
Faithfulness / Stableness: Independencies in data arise not from coincidence, but from causal structure. Expressed differently: if two variables are independent given some subset of other variables, then they are not connected by a causal link in the graph.
Causal Sufficiency: Measured variables include all of the common causes.
Causal Markov Condition: All the relevant probabilistic information that can be obtained from the system is contained in its direct causes. Expressed differently: if two variables are not connected in the causal graph given some set of conditions (see Runge, Chaos 2018 for further definitions), then they are conditionally independent.
No contemporaneous effects: There are no causal effects at lag zero.
Stationarity
Parametric assumptions of independence tests (these were already discussed in the basic tutorial)
Faithfulness, as stated above, expresses the assumption that the independencies we measure come from the causal structure, i.e., the time series graph, and cannot occur due to some fine-tuning of the parameters. Another unfaithful case is a process containing purely deterministic dependencies, i.e., $Y=f(X)$ without any noise. We illustrate these cases in the following.
Suppose in our model we have two ways in which $X^0$ causes $X^2$, a direct one, and an indirect effect $X^0\to X^1 \to X^2$ as realized in the following model:
\begin{align*} X^0_t &= \eta^0_t\\ X^1_t &= 0.6 X^0_{t-1} + \eta^1_t\\ X^2_t &= 0.6 X^1_{t-1} - 0.36 X^0_{t-2} + \eta^2_t\\ \end{align*}
In [2]:
np.random.seed(1)
data = np.random.randn(500, 3)
for t in range(1, 500):
    # data[t, 0] += 0.6*data[t-1, 1]
    data[t, 1] += 0.6*data[t-1, 0]
    data[t, 2] += 0.6*data[t-1, 1] - 0.36*data[t-2, 0]
var_names = [r'$X^0$', r'$X^1$', r'$X^2$', r'$X^3$']
dataframe = pp.DataFrame(data, var_names=var_names)
# tp.plot_timeseries(dataframe)
Since here $X^2_t = 0.6 X^1_{t-1} - 0.36 X^0_{t-2} + \eta^2_t = 0.6 (0.6 X^0_{t-2} + \eta^1_{t-1}) - 0.36 X^0_{t-2} + \eta^2_t = 0.36 X^0_{t-2} - 0.36 X^0_{t-2} + 0.6 \eta^1_{t-1} + \eta^2_t$, the contributions of $X^0_{t-2}$ cancel exactly: there is no unconditional dependency $X^0_{t-2} \to X^2_t$, and the link is not detected in the condition-selection step:
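This cancellation can be checked numerically on data regenerated from the same model: the unconditional lag-2 correlation between $X^0$ and $X^2$ is close to zero, while the lag-1 correlation between $X^1$ and $X^2$ is not. A minimal sketch in plain numpy (independent of tigramite; the loop starts at $t=2$ to avoid index wrap-around):

```python
import numpy as np

# Regenerate the model from above
np.random.seed(1)
data = np.random.randn(500, 3)
for t in range(2, 500):
    data[t, 1] += 0.6*data[t-1, 0]
    data[t, 2] += 0.6*data[t-1, 1] - 0.36*data[t-2, 0]

# Unconditional lag-2 dependency X^0_{t-2} -> X^2_t: cancels out
r_02 = np.corrcoef(data[:-2, 0], data[2:, 2])[0, 1]
# Lag-1 dependency X^1_{t-1} -> X^2_t: clearly visible
r_12 = np.corrcoef(data[:-1, 1], data[1:, 2])[0, 1]
print(round(r_02, 3), round(r_12, 3))
```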
In [3]:
parcorr = ParCorr()
pcmci_parcorr = PCMCI(
    dataframe=dataframe,
    cond_ind_test=parcorr,
    verbosity=1)
all_parents = pcmci_parcorr.run_pc_stable(tau_max=2, pc_alpha=0.2)
However, since the other parent of $X^2$, namely $X^1_{t-1}$, is detected, the MCI step conditions on $X^1_{t-1}$ and can reveal the true underlying graph (in this particular case):
In [4]:
results = pcmci_parcorr.run_pcmci(tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
Note, however, that this is not always the case: such a cancellation, even though pathological, can present a problem, especially for smaller sample sizes.
In [5]:
np.random.seed(1)
data = np.random.randn(500, 3)
for t in range(1, 500):
    data[t, 0] = 0.4*data[t-1, 1]
    data[t, 2] += 0.3*data[t-2, 1] + 0.7*data[t-1, 0]
dataframe = pp.DataFrame(data, var_names=var_names)
tp.plot_timeseries(dataframe)
Out[5]:
In [7]:
parcorr = ParCorr()
pcmci_parcorr = PCMCI(
    dataframe=dataframe,
    cond_ind_test=parcorr,
    verbosity=2)
results = pcmci_parcorr.run_pcmci(tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
link_matrix = pcmci_parcorr.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
# Plot time series graph
tp.plot_time_series_graph(
    figsize=(6, 3),
    val_matrix=results['val_matrix'],
    link_matrix=link_matrix,
    var_names=var_names,
    link_colorbar_label='MCI',
    )
Here the correlation between $X^1_{t-1}$ and $X^0_t$ is exactly 1, because $X^0_t$ is a deterministic copy of $X^1_{t-1}$. Since the two effectively represent the same variable, the true link $X^0_{t-1} \to X^2_t$ can no longer be detected once we condition on $X^1_{t-2}$. Deterministic copies of other variables should therefore be excluded from the analysis.
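As a practical safeguard, one can screen variable pairs for (near-)deterministic lagged relationships before running PCMCI. A minimal sketch in plain numpy; the helper name and the $r^2$ threshold are our own choices, not part of tigramite:

```python
import numpy as np

def find_deterministic_pairs(data, tau_max=2, r2_threshold=0.99):
    """Flag lagged pairs (i, tau, j) where X^j_t is an (almost) deterministic
    linear copy of X^i_{t-tau}, judged by squared correlation."""
    T, N = data.shape
    flagged = []
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            for tau in range(1, tau_max + 1):
                r = np.corrcoef(data[:T - tau, i], data[tau:, j])[0, 1]
                if r**2 >= r2_threshold:
                    flagged.append((i, tau, j))
    return flagged

# Example: X^0_t = 0.4 * X^1_{t-1} is a deterministic copy
np.random.seed(1)
demo_data = np.random.randn(500, 3)
for t in range(2, 500):
    demo_data[t, 0] = 0.4*demo_data[t-1, 1]
    demo_data[t, 2] += 0.3*demo_data[t-2, 1] + 0.7*demo_data[t-1, 0]
print(find_deterministic_pairs(demo_data))  # flags (i=1, tau=1, j=0)
```

Any flagged variable can then be dropped from the dataframe before the analysis.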
Causal sufficiency demands that the set of variables contains all common causes of any two variables. This assumption is mostly violated when analyzing open complex systems outside a confined experimental setting. Any link estimated by a causal discovery algorithm could become non-significant if further variables are included in the analysis. Observational causal inference under the assumption of causal sufficiency should therefore generally be seen as one step towards a physical process understanding. There exist, however, algorithms that take hidden confounders into account and can explicitly represent confounded links (e.g., the FCI algorithm). Causal discovery can greatly help in exploratory model building to get an idea of potential drivers. In particular, the absence of a link allows for a more robust conclusion: if there is no evidence for a statistical dependency, then a physical mechanism is less likely (assuming that the other assumptions hold).
See Runge, Jakob. 2018. “Causal Network Reconstruction from Time Series: From Theoretical Assumptions to Practical Estimation.” Chaos: An Interdisciplinary Journal of Nonlinear Science 28 (7): 075310, for alternative approaches that do not require causal sufficiency.
For the following common driver process, suppose the common driver $U$ was not measured:
In [8]:
np.random.seed(1)
data = np.random.randn(10000, 5)
a = 0.8
for t in range(5, 10000):
    data[t, 0] += a*data[t-1, 0]
    data[t, 1] += a*data[t-1, 1] + 0.5*data[t-1, 0]
    data[t, 2] += a*data[t-1, 2] + 0.5*data[t-1, 1] + 0.5*data[t-1, 4]
    data[t, 3] += a*data[t-1, 3] + 0.5*data[t-2, 4]
    data[t, 4] += a*data[t-1, 4]
# Observed data exclude the hidden driver U (variable 4)
obsdata = data[:, [0, 1, 2, 3]]
var_names_lat = ['W', 'Y', 'X', 'Z', 'U']
In [9]:
for data_here in [data, obsdata]:
    dataframe = pp.DataFrame(data_here)
    parcorr = ParCorr()
    pcmci_parcorr = PCMCI(
        dataframe=dataframe,
        cond_ind_test=parcorr,
        verbosity=0)
    results = pcmci_parcorr.run_pcmci(tau_max=5, pc_alpha=0.1)
    pcmci_parcorr.print_significant_links(
        p_matrix=results['p_matrix'],
        val_matrix=results['val_matrix'],
        alpha_level=0.01)
    link_matrix = pcmci_parcorr.return_significant_parents(
        pq_matrix=results['p_matrix'],
        val_matrix=results['val_matrix'], alpha_level=0.001)['link_matrix']
    tp.plot_graph(
        val_matrix=results['val_matrix'],
        link_matrix=link_matrix,
        var_names=var_names_lat,
        link_colorbar_label='cross-MCI',
        node_colorbar_label='auto-MCI',
        )
The upper plot shows the true causal graph if all variables are observed. The lower graph shows the case where variable $U$ is hidden. Then several spurious links appear: (1) $X\to Z$, and (2) links from $Y$ and $W$ to $Z$, which is counterintuitive because there is no possible indirect pathway (see upper graph). What's the reason? The culprit is the collider $X$: MCI (or FullCI, or any other causal measure conditioning on the entire past) between $Y$ and $Z$ is conditioned on the parents of $Z$, which include $X$ in the lower latent graph. Conditioning on a collider opens up the paths from $Y$ and $W$ to $Z$ and renders them dependent.
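The collider effect is easy to reproduce in isolation: two independent variables become spuriously dependent once we condition on their common effect. A minimal sketch with a hand-rolled linear partial correlation (our own helper following the same idea as ParCorr, not tigramite's implementation):

```python
import numpy as np

def partial_corr(a, b, z):
    """Correlation of a and b after linearly regressing out z from both."""
    beta_a = np.cov(a, z, ddof=0)[0, 1] / np.var(z)
    beta_b = np.cov(b, z, ddof=0)[0, 1] / np.var(z)
    return np.corrcoef(a - beta_a*z, b - beta_b*z)[0, 1]

np.random.seed(1)
T = 10000
Y = np.random.randn(T)
W = np.random.randn(T)
X = 0.7*Y + 0.7*W + np.random.randn(T)  # collider: Y -> X <- W

r_plain = np.corrcoef(Y, W)[0, 1]   # ~0: Y and W are independent
r_given_X = partial_corr(Y, W, X)   # clearly negative: conditioning opened the path
print(round(r_plain, 2), round(r_given_X, 2))
```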
In a geoscientific context, solar forcing is typically a strong common driver of many processes. To remove this trivial effect, time series are typically anomalized, that is, the average seasonal cycle is subtracted. But one could also include the solar forcing explicitly, as shown here via a sine wave for an artificial example. We've also made the time series more realistic by adding an auto-dependency on their past values.
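A minimal sketch of such an anomalization step, assuming a fixed cycle period (real applications would also handle leap years and possibly smooth the climatology):

```python
import numpy as np

def anomalize(data, period=365):
    """Subtract the average seasonal cycle (climatology) from each variable."""
    T = data.shape[0]
    phase = np.arange(T) % period  # position within the seasonal cycle
    anomalies = np.array(data, dtype=float)
    for p in range(period):
        anomalies[phase == p] -= anomalies[phase == p].mean(axis=0)
    return anomalies

# Example: a strong common seasonal cycle plus small independent noise
np.random.seed(1)
T = 365 * 10
cycle = np.sin(2*np.pi*np.arange(T)/365.)
data = cycle[:, None] + 0.1*np.random.randn(T, 2)
anom = anomalize(data)
print(round(data.std(), 2), round(anom.std(), 2))  # cycle removed, noise remains
```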
In [10]:
np.random.seed(1)
data = np.random.randn(1000, 4)
# Simple sun
data[:,3] = np.sin(np.arange(1000)*20/np.pi)
var_names[3] = 'Sun'
c = 0.8
for t in range(1, 1000):
    data[t, 0] += 0.4*data[t-1, 0] + 0.4*data[t-1, 1] + c*data[t-1, 3]
    data[t, 1] += 0.5*data[t-1, 1] + c*data[t-1, 3]
    data[t, 2] += 0.6*data[t-1, 2] + 0.3*data[t-2, 1] + c*data[t-1, 3]
dataframe = pp.DataFrame(data, var_names=var_names)
tp.plot_timeseries(dataframe)
Out[10]:
If we do not account for the common solar forcing, there will be many spurious links:
In [11]:
parcorr = ParCorr()
dataframe_nosun = pp.DataFrame(data[:,[0,1,2]], var_names=var_names)
pcmci_parcorr = PCMCI(
    selected_variables=[0, 1, 2],
    dataframe=dataframe_nosun,
    cond_ind_test=parcorr,
    verbosity=0)
results = pcmci_parcorr.run_pcmci(tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
However, if we explicitly include the solar forcing variable (which we assume is known in this case), we can identify the correct causal graph. Since we are not interested in the drivers of the solar forcing variable, we don't attempt to reconstruct its parents. This can be achieved via selected_variables = [0,1,2].
In [12]:
parcorr = ParCorr()
pcmci_parcorr = PCMCI(
    selected_variables=[0, 1, 2],
    dataframe=dataframe,
    cond_ind_test=parcorr,
    verbosity=0)
results = pcmci_parcorr.run_pcmci(tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
link_matrix = pcmci_parcorr.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
# Plot time series graph
tp.plot_time_series_graph(
    figsize=(6, 3),
    val_matrix=results['val_matrix'],
    link_matrix=link_matrix,
    var_names=var_names,
    link_colorbar_label='MCI',
    )
Another implicit assumption concerns the time sampling. Consider a causal loop process, first sampled at its original time resolution:
In [13]:
np.random.seed(1)
data = np.random.randn(1000, 3)
for t in range(1, 1000):
    data[t, 0] += 0.*data[t-1, 0] + 0.6*data[t-1, 2]
    data[t, 1] += 0.*data[t-1, 1] + 0.6*data[t-1, 0]
    data[t, 2] += 0.*data[t-1, 2] + 0.6*data[t-1, 1]
dataframe = pp.DataFrame(data, var_names=var_names)
tp.plot_timeseries(dataframe)
Out[13]:
With the original time sampling we obtain the correct causal graph:
In [14]:
pcmci_parcorr = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr())
results = pcmci_parcorr.run_pcmci(tau_min=0, tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
In [15]:
link_matrix = pcmci_parcorr.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
# Plot time series graph
tp.plot_time_series_graph(
    figsize=(6, 3),
    val_matrix=results['val_matrix'],
    link_matrix=link_matrix,
    var_names=var_names,
    link_colorbar_label='MCI',
    )
If we sub-sample the data, counterintuitive links can appear; the true causal loop even gets detected in the wrong direction:
In [16]:
sampled_data = data[::2]
pcmci_parcorr = PCMCI(dataframe=pp.DataFrame(sampled_data, var_names=var_names),
                      cond_ind_test=ParCorr(), verbosity=0)
results = pcmci_parcorr.run_pcmci(tau_min=0, tau_max=2, pc_alpha=0.2)
link_matrix = pcmci_parcorr.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
# Plot time series graph
tp.plot_time_series_graph(
    figsize=(6, 3),
    val_matrix=results['val_matrix'],
    link_matrix=link_matrix,
    var_names=var_names,
    link_colorbar_label='MCI',
    )
If causal lags are smaller than the time sampling interval, such problems may occur. Causal inference for sub-sampled data is still an active area of research.
The Markov condition can be rephrased as assuming that the noises driving each variable are mutually independent and independent in time (i.i.d.). This is violated in the following example, where each variable is driven by 1/f noise ("1/f" refers to the scaling of the power spectrum). 1/f noise can be generated by averaging AR(1) processes (http://www.scholarpedia.org/article/1/f_noise), which means that the noise is no longer independent in time (even though the noise terms of the individual variables are still independent of each other). Note that this constitutes a violation of the Markov condition of the observed process only, so one might rather call it a violation of causal sufficiency.
In [17]:
np.random.seed(1)
T = 10000
# Generate 1/f noise by averaging AR(1) processes with a wide range of coefficients
# (http://www.scholarpedia.org/article/1/f_noise)
def one_over_f_noise(T, n_ar=20):
    whitenoise = np.random.randn(T, n_ar)
    ar_coeffs = np.linspace(0.1, 0.9, n_ar)
    for t in range(1, T):
        whitenoise[t] += ar_coeffs*whitenoise[t-1]
    return whitenoise.sum(axis=1)

data = np.random.randn(T, 3)
data[:, 0] += one_over_f_noise(T)
data[:, 1] += one_over_f_noise(T)
data[:, 2] += one_over_f_noise(T)
for t in range(1, T):
    data[t, 0] += 0.4*data[t-1, 1]
    data[t, 2] += 0.3*data[t-2, 1]
dataframe = pp.DataFrame(data, var_names=var_names)
tp.plot_timeseries(dataframe)
# plt.psd(data[:,0],return_line=True)[2]
# plt.psd(data[:,1],return_line=True)[2]
# plt.psd(data[:,2],return_line=True)[2]
# plt.gca().set_xscale("log", nonposx='clip')
# plt.gca().set_yscale("log", nonposy='clip')
Out[17]:
Here PCMCI will detect many spurious links, especially auto-dependencies, since the process has long memory and the present state is not independent of the further past given any finite set of lagged parents.
In [18]:
parcorr = ParCorr()
pcmci_parcorr = PCMCI(
    dataframe=dataframe,
    cond_ind_test=parcorr,
    verbosity=0)
results = pcmci_parcorr.run_pcmci(tau_max=5, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
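One way to diagnose such long-memory behavior before interpreting the output is to compare the sample autocorrelation decay with a short-memory AR(1) benchmark: 1/f-type noise retains substantially more correlation at longer lags. A minimal sketch (the autocorr helper is our own, not part of tigramite):

```python
import numpy as np

def autocorr(x, lags):
    """Sample autocorrelation of a 1D series at the given positive lags."""
    x = x - x.mean()
    return np.array([np.corrcoef(x[:-lag], x[lag:])[0, 1] for lag in lags])

np.random.seed(1)
T = 10000

# Short-memory benchmark: AR(1) with coefficient 0.5
ar1 = np.random.randn(T)
for t in range(1, T):
    ar1[t] += 0.5*ar1[t-1]

# 1/f-like noise: sum of AR(1) processes with a wide range of coefficients
noise = np.random.randn(T, 20)
coeffs = np.linspace(0.1, 0.9, 20)
for t in range(1, T):
    noise[t] += coeffs*noise[t-1]
one_over_f = noise.sum(axis=1)

lags = [1, 5, 20]
acf_ar1 = autocorr(ar1, lags)        # drops off geometrically (~0.5**lag)
acf_1f = autocorr(one_over_f, lags)  # decays much more slowly
print(np.round(acf_ar1, 2), np.round(acf_1f, 2))
```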
A related issue is time aggregation. Consider the following coupled process, first analyzed at its original time resolution:
In [19]:
np.random.seed(1)
data = np.random.randn(1000, 3)
for t in range(1, 1000):
    data[t, 0] += 0.7*data[t-1, 0]
    data[t, 1] += 0.6*data[t-1, 1] + 0.6*data[t-1, 0]
    data[t, 2] += 0.5*data[t-1, 2] + 0.6*data[t-1, 1]
dataframe = pp.DataFrame(data, var_names=var_names)
tp.plot_timeseries(dataframe)
Out[19]:
With the original time aggregation we obtain the correct causal graph:
In [20]:
pcmci_parcorr = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr())
results = pcmci_parcorr.run_pcmci(tau_min=0, tau_max=2, pc_alpha=0.2)
pcmci_parcorr.print_significant_links(
    p_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'],
    alpha_level=0.01)
In [21]:
link_matrix = pcmci_parcorr.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
# Plot time series graph
tp.plot_time_series_graph(
    figsize=(6, 3),
    val_matrix=results['val_matrix'],
    link_matrix=link_matrix,
    var_names=var_names,
    link_colorbar_label='MCI',
    )
If we aggregate the data, we also detect a contemporaneous dependency, for which no causal direction can be assessed in this framework, and we additionally obtain several lagged spurious links. Essentially, we now have direct causal effects that appear contemporaneous on the aggregated time scale. Causal inference for time-aggregated data is also still an active area of research. Note again that this constitutes a violation of the Markov condition of the observed process only, so one might rather call it a violation of causal sufficiency.
In [22]:
aggregated_data = pp.time_bin_with_mask(data, time_bin_length=4)
pcmci_parcorr = PCMCI(dataframe=pp.DataFrame(aggregated_data[0], var_names=var_names),
                      cond_ind_test=ParCorr(), verbosity=0)
results = pcmci_parcorr.run_pcmci(tau_min=0, tau_max=2, pc_alpha=0.2)
# pcmci_parcorr.print_significant_links(
#     p_matrix=results['p_matrix'],
#     val_matrix=results['val_matrix'],
#     alpha_level=0.01)
In [23]:
link_matrix = pcmci_parcorr.return_significant_parents(
    pq_matrix=results['p_matrix'],
    val_matrix=results['val_matrix'], alpha_level=0.01)['link_matrix']
# Plot time series graph
tp.plot_time_series_graph(
    figsize=(6, 3),
    val_matrix=results['val_matrix'],
    link_matrix=link_matrix,
    var_names=var_names,
    link_colorbar_label='MCI',
    )
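For reference, the time binning performed above by pp.time_bin_with_mask with time_bin_length=4 essentially averages non-overlapping blocks of 4 consecutive samples. A minimal plain-numpy sketch of that aggregation step (our own helper, shown only to make the operation transparent):

```python
import numpy as np

def time_bin(data, bin_length):
    """Average non-overlapping blocks of bin_length consecutive time steps."""
    T, N = data.shape
    n_bins = T // bin_length  # any incomplete final block is dropped
    return data[:n_bins*bin_length].reshape(n_bins, bin_length, N).mean(axis=1)

# Toy example: T=12 time steps, N=2 variables
data = np.arange(24, dtype=float).reshape(12, 2)
binned = time_bin(data, bin_length=4)
print(binned.shape)  # (3, 2)
```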