Detecting Change in Time Series Data v0.1


In [1]:
import numpy as np
import scipy.stats as stats
import pymc3 as pm
import matplotlib.pyplot as plt
plt.style.use(['ggplot', 'seaborn-talk'])

%matplotlib inline

In [2]:
#loading the data

NIRS_vals = []

with open("NIRS_data.txt", "r") as spfile:
    for line in spfile:
        NIRS_vals.append(np.float(line))

NIRS_vals = np.array(NIRS_vals)

In [3]:
NIRS_vals.shape[0]


Out[3]:
1178

Preliminary description of the dataset

  • The dataset consists of Near-Infrared Spectrographic (NIRS) readings of oxygenated haemoglobin concentrations tracking frontal lobe activity.

  • The data has been gathered during an attention task in ADHD adults.

  • Because of the attention task, a change is expected in the time series of NIRS values and a simple plot of the data shows it to be the case.

  • Eyeball estimates put the change to occur around around 750 unit of time.

  • We need to make an inference concerning this change point.


In [4]:
plt.figure(figsize=(17,8))
plt.scatter(np.arange(NIRS_vals.shape[0]), NIRS_vals, color='red', alpha=0.4)
plt.axvline(750, lw=10, color='dodgerblue', alpha=0.2)
plt.xlabel('Time')
plt.ylabel('NIRS value')
plt.show()


The Model

  • We take the NIRS reading to follow a Gaussian distribution with mean dependant on the time of the change that we wish to capture.
$$ \text{NIRS_values}(t) \sim \begin{cases} \text{Gaussian}(\mu_0, \sigma^2) \hspace{0.5cm} \text{if }t < \tau \\ \text{Gaussian}(\mu_1, \sigma^2) \hspace{0.5cm} \text{if }t \geq \tau \end{cases} $$
  • The two means are taken to have Gaussian priors centered around $0$ and with high variance.
$$ \mu_i \sim \text{Gaussian}(0, 1000) \hspace{0.5cm} \text{for } i = 0,1$$
  • We assume the variance of these different Gaussians to be the same.
  • The precision ($\frac{1}{\sigma^2}$) has as conjugate prior the Gamma distribution, which we set with low parameters.
$$ \frac{1}{\sigma^2} \sim \text{Gamma}(\frac{1}{1000}, \frac{1}{1000})$$
  • The time of change is taken uniformly from the whole time interval of the experiment.
$$ \tau \sim \text{Uniform}(0, t_{max})$$

In [5]:
len_time = NIRS_vals.shape[0]
time_indx = np.arange(len_time)

adhd_model = pm.Model()

with adhd_model:
    #priors
    mus_0 = pm.Normal('mu_0', 0, 1000)
    mus_1 = pm.Normal('mu_1', 0, 1000)
    
    prec = pm.Gamma('prec', alpha=0.001, beta=0.001)
    comm_dev = pm.Deterministic('st_dev', np.sqrt(1/prec))
    tau = pm.DiscreteUniform('tau', lower=0, upper=len_time)
    
    mu_s = pm.math.switch(time_indx < tau, mus_0, mus_1)
    
    #likelihood
    nirs_val = pm.Normal('NIRS_val', mu=mu_s, sd=comm_dev, observed=NIRS_vals)

In [6]:
with adhd_model:
    trace_nirs = pm.sample(10000)


Assigned NUTS to mu_0
Assigned NUTS to mu_1
Assigned NUTS to prec_log_
Assigned Metropolis to tau
100%|██████████| 10000/10000 [00:20<00:00, 479.57it/s]

In [12]:
plt.figure(figsize=(12,7))
plt.title('Histogram of the posterior of the time-change parameter')
plt.hist(trace_nirs['tau'][100:], bins=100, normed=1, color='crimson', alpha=0.7)
plt.show()



In [8]:
tau_MAP = stats.mode(trace_nirs['tau'])
print (tau_MAP)


ModeResult(mode=array([731]), count=array([3547]))

In [11]:
plt.figure(figsize=(17,8))
plt.ylim([0, NIRS_vals.max()+np.abs(NIRS_vals.min())])
colbrew = ['#e41a1c', '#984ea3', '#4daf4a', '#2b83ba']

plt.scatter(np.arange(NIRS_vals.shape[0]), NIRS_vals, color=colbrew[0], alpha=0.4)

plt.plot((0,tau_MAP[0]), 
         (NIRS_vals[:np.int(tau_MAP[0])].mean(), NIRS_vals[:np.int(tau_MAP[0])].mean()), 
         color=colbrew[1], ls='--', label='mean before change')
plt.plot((tau_MAP[0], NIRS_vals.shape[0]), 
         (NIRS_vals[np.int(tau_MAP[0]):].mean(), NIRS_vals[np.int(tau_MAP[0]):].mean()), 
         color=colbrew[2], ls='--', label='mean after change')

plt.axvline(tau_MAP[0], lw=3, color=colbrew[-1], alpha=0.7, label='Est. time of change')

plt.xlabel('Time')
plt.ylabel('NIRS value')
plt.legend()
plt.show()


Bibiliography

  • Michael D. Lee, Eric-Jan Wagenmakers, Bayesian Cognitive Modeling, Cambridge University Press, 2014

In [ ]: