Lecture 7 - The ex-Gaussian model for RTs

Recall from last time that we would like to model RT as a sum of two random variables D and M, where

  • $M \sim \text{Normal}(\mu,\sigma)$
  • $D \sim \text{Exponential}(\lambda)$

These random variables have associated probability density functions:

  • $f(t\mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\Bigl( -\frac{(t-\mu)^2}{2\sigma^2}\Bigr)$
  • $g(t\mid \lambda) = \lambda e^{-\lambda t}$

Then, the probability density for $RT = D+M$ is given by the convolution of $f$ and $g$. Since the exponential density $g$ is zero for negative arguments, $g(t-x)=0$ whenever $x>t$, so the convolution can be written as:

$$ f_{EG}(t\mid \mu,\sigma,\lambda) = \int_{-\infty}^t f(x)g(t-x)dx $$

Let's see that the convolution works in principle:


In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import norm, expon

In [0]:
T = np.linspace(0,600,200)

plt.plot(T, norm.pdf(T, loc=300, scale=50))
plt.show()

In [0]:
plt.plot(T, expon.pdf(T, scale=100))
plt.show()

# note: scale = 1/lambda

In [0]:
f = norm.pdf(T, loc=300, scale=50)
g = expon.pdf(T, scale=100)
fg = np.convolve(f,g)

plt.plot(fg)
plt.show()
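One caveat about the cell above: `np.convolve` returns a discrete sum, so approximating the continuous convolution integral requires scaling by the grid spacing, and the output needs its own time axis (its length is `2*len(T) - 1`, spanning twice the original range). A sketch of the corrected check, reusing the same grid and parameters as above — if the scaling is right, the result should integrate to roughly 1:

```python
import numpy as np
from scipy.stats import norm, expon

# same grid and parameters as above
T = np.linspace(0, 600, 200)
dt = T[1] - T[0]          # grid spacing

f = norm.pdf(T, loc=300, scale=50)
g = expon.pdf(T, scale=100)

# scale the discrete convolution by dt to approximate the integral
fg = np.convolve(f, g) * dt

# the output has length 2*len(T) - 1; build a matching time axis
T_conv = np.arange(len(fg)) * dt

# a proper density should integrate to (approximately) 1
area = np.sum(fg) * dt
print(area)
```

The small deviation from 1 comes from the Riemann-sum approximation and the truncation of both densities to [0, 600].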

In [0]:
# let's use the built-in ex-Gaussian density from scipy (thanks Nicholas!)

from scipy.stats import exponnorm

plt.plot(T, exponnorm.pdf(T, loc=200, scale=50, K=2))
plt.show()

Note about parameterizations:

Different authors use different parameterizations of the ex-Gaussian density. I prefer the one used by Matzke & Wagenmakers (2009), which involves the following three parameters:

  • $\mu$ = expected value of the normal component
  • $\sigma$ = standard deviation of normal component
  • $\tau$ = expected value of the exponential (tail) component

The density function is given by:

$$ f(t\mid \mu, \sigma, \tau) = \frac{1}{\tau \sqrt{2\pi}}\exp\Bigl(\frac{\sigma^2}{2\tau^2}-\frac{t-\mu}{\tau}\Bigr)\int_{-\infty}^{\frac{t-\mu}{\sigma}-\frac{\sigma}{\tau}}\exp \Bigl( -\frac{y^2}{2}\Bigr)dy $$
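Note that $\int_{-\infty}^{z} \exp(-y^2/2)\,dy = \sqrt{2\pi}\,\Phi(z)$, where $\Phi$ is the standard normal CDF, so the density can be coded directly with `norm.cdf`. As a quick sanity check (a sketch; `exgauss_pdf` is just an illustrative helper name), it should agree with scipy's `exponnorm` under the mapping $K = \tau/\sigma$:

```python
import numpy as np
from scipy.stats import norm, exponnorm

def exgauss_pdf(t, mu, sigma, tau):
    """Ex-Gaussian density, Matzke & Wagenmakers (2009) parameterization."""
    z = (t - mu) / sigma - sigma / tau
    return (1 / tau) * np.exp(sigma**2 / (2 * tau**2) - (t - mu) / tau) * norm.cdf(z)

t = np.linspace(100, 600, 50)
mu, sigma, tau = 300, 50, 100

# scipy's exponnorm uses K = tau/sigma, loc = mu, scale = sigma
ours = exgauss_pdf(t, mu, sigma, tau)
scipys = exponnorm.pdf(t, K=tau/sigma, loc=mu, scale=sigma)
print(np.max(np.abs(ours - scipys)))  # should be essentially zero
```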

One advantage of this parameterization is that we have the following:

  • $E(t)=\mu+\tau$
  • $Var(t) = \sigma^2+\tau^2$

implying that the overall mean (and variance) of our RT distribution can be thought of as "decomposed" into a normal component and an exponential tail component.
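These identities are easy to verify numerically with scipy's `exponnorm` (introduced above), whose shape parameter corresponds to $K = \tau/\sigma$ — a quick sketch with arbitrary example values:

```python
from scipy.stats import exponnorm

mu, sigma, tau = 300, 50, 100   # arbitrary example values

# map (mu, sigma, tau) onto scipy's (K, loc, scale)
m, v = exponnorm.stats(K=tau/sigma, loc=mu, scale=sigma, moments='mv')

print(m)   # E(t)   = mu + tau        = 400
print(v)   # Var(t) = sigma^2 + tau^2 = 12500
```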


In [0]:
# in scipy's exponnorm, this parameterization is equivalent to the following:
#
# mu = loc
# sigma = scale
# tau = K*sigma

# we can now define the NLL for the ex-Gaussian
# first, let's load our data:
dat = pd.read_csv("https://raw.githubusercontent.com/tomfaulkenberry/courses/master/summer2019/mathpsychREU/schwarz-A.csv")
X = dat.RT

def nllEG(pars):
  mu, sigma, tau = pars
  ll = np.log(exponnorm.pdf(X, loc=mu, scale=sigma, K=tau/sigma))
  return(-1*np.sum(ll))

In [0]:
# test our nllEG function

pars=np.array([200, 50, 100])
nllEG(pars)

In [0]:
# minimize the NLL against observed RTs

from scipy.optimize import minimize
from scipy.stats import skew

# init recommendations from Lacouture & Cousineau (2008)

tau_init = 0.8*X.std()
mu_init = X.mean() - tau_init  # since E(t) = mu + tau
sigma_init = np.sqrt(X.var() - tau_init**2)

# store init parameters as numpy array
inits = np.array([mu_init, sigma_init, tau_init])

mleEG = minimize(fun = nllEG, 
               x0 = inits,
               method = 'nelder-mead')

print(mleEG)

In [0]:
# check that overall mean is decomposed into mu + tau

print(X.mean())
print(mleEG.x[0]+mleEG.x[2])

In [0]:
# let's plot the fit against our observed data

plt.hist(X, bins=np.linspace(100, 600, 25), density=True, color='lightgray')

# extract fit and plot it
mu, sigma, tau = mleEG.x

T = np.linspace(100,600,100)
plt.plot(T, exponnorm.pdf(T, loc=mu, scale=sigma, K=tau/sigma))
plt.show()

In [0]:
# compare to normal fit from last time

# construct NLL for normal
def nllNormal(pars):
  mu, sigma = pars
  ll = np.log(norm.pdf(X, loc=mu, scale=sigma))
  return(-1*np.sum(ll))
  
mu_init = X.mean()
sigma_init = X.std()

# store init parameters as numpy array
inits = np.array([mu_init, sigma_init])

mleNormal = minimize(fun = nllNormal, 
               x0 = inits,
               method = 'nelder-mead')


plt.hist(X, bins=np.linspace(100, 600, 25), density=True, color='lightgray')

# extract normal fit and plot
mean, sd = mleNormal.x
plt.plot(T, norm.pdf(T, loc=mean, scale=sd), '--')

# extract ex-Gaussian fit and plot it
mu, sigma, tau = mleEG.x

T = np.linspace(100,600,100)
plt.plot(T, exponnorm.pdf(T, loc=mu, scale=sigma, K=tau/sigma))
plt.show()
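Beyond eyeballing the two curves, the fits can also be compared on their minimized negative log-likelihoods; since the ex-Gaussian has one extra parameter, a penalized score such as AIC ($2k + 2\,\text{NLL}$, lower is better) makes the comparison fairer. A self-contained sketch on simulated skewed data (a stand-in for the Schwarz RTs; the seed and parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import norm, exponnorm
from scipy.optimize import minimize

# simulate skewed "RTs" (a stand-in for the real data)
x = exponnorm.rvs(K=2.0, loc=250, scale=40, size=500, random_state=1)

def nll_eg(pars):
    mu, sigma, tau = pars
    if sigma <= 0 or tau <= 0:   # guard against invalid parameters
        return np.inf
    return -np.sum(exponnorm.logpdf(x, K=tau/sigma, loc=mu, scale=sigma))

def nll_normal(pars):
    mu, sigma = pars
    if sigma <= 0:
        return np.inf
    return -np.sum(norm.logpdf(x, loc=mu, scale=sigma))

fit_eg = minimize(nll_eg, x0=[250, 40, 80], method='nelder-mead')
fit_normal = minimize(nll_normal, x0=[x.mean(), x.std()], method='nelder-mead')

# AIC = 2k + 2*NLL; lower is better
aic_eg = 2 * 3 + 2 * fit_eg.fun
aic_normal = 2 * 2 + 2 * fit_normal.fun
print(aic_eg, aic_normal)
```

On genuinely skewed data the ex-Gaussian should come out well ahead of the normal, even after paying for the extra parameter.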

In [0]:
# get ex-Gaussian parameters
mu, sigma, tau = mleEG.x
print(mu)
print(sigma)
print(tau)

In [0]:
# use bootstrapping to get 95% confidence interval on parameters

# first, define a function to do MLE and return parameters

def fitEG(x):
  
  # compute the nll
  def nllEG(pars):
    mu, sigma, tau = pars
    ll = np.log(exponnorm.pdf(x, loc=mu, scale=sigma, K=tau/sigma))
    return(-1*np.sum(ll))
  
  # minimize the nll
  tau_init = 0.8*x.std()
  mu_init = x.mean() - tau_init  # since E(t) = mu + tau
  sigma_init = np.sqrt(x.var() - tau_init**2)

  # store init parameters as numpy array
  inits = np.array([mu_init, sigma_init, tau_init])

  mleEG = minimize(fun = nllEG, 
                   x0 = inits,
                   method = 'nelder-mead')
  
  # return the fitted parameter estimates
  return mleEG.x[0], mleEG.x[1], mleEG.x[2]

In [0]:
# now, we do the bootstrapping

# set number of bootstrap samples
N = 100

# set seed for reproducibility
np.random.seed(122)

# obtain the original model fit
mu, sigma, tau = fitEG(X)

# set empty lists to store parameter values
Mu = []
Sigma = []
Tau = []

for i in range(N):
  # generate new data
  Xsim = exponnorm.rvs(size=len(X), loc=mu, scale=sigma, K=tau/sigma)
  
  # find MLE for mu, sigma, and tau
  mu_bootstrap, sigma_bootstrap, tau_bootstrap = fitEG(Xsim)
  
  # append these estimates 
  Mu.append(mu_bootstrap)
  Sigma.append(sigma_bootstrap)
  Tau.append(tau_bootstrap)

In [0]:
plt.hist(Mu, color="lightgray")
plt.show()

plt.hist(Sigma, color="lightgray")
plt.show()

plt.hist(Tau, color="lightgray")
plt.show()

In [0]:
print(np.quantile(Mu, [0.025, 0.975]))
print(np.quantile(Sigma, [0.025, 0.975]))
print(np.quantile(Tau, [0.025, 0.975]))

In [0]:
# next step in modeling -- look at effects of experimental manipulations
# Schwarz (2001) manipulated 'p_go' and 'd'

dat.head()

In [0]:
# Homework:
# 
# your task is to find ex-Gaussian fits (and bootstrap confidence intervals) for each of the following conditions:
# 1. p_go = 0.5 and d = 1
# 2. p_go = 0.5 and d = 4
# 3. p_go = 0.75 and d = 1
# 4. p_go = 0.75 and d = 4

# hint:  use the 'query' function from pandas

example = dat.query('p_go == 0.5')

print(example.head())
print(example.shape)
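The two homework filters can be combined in a single query string. A minimal sketch using a toy DataFrame as a stand-in for `dat` (same column names; the values are made up):

```python
import pandas as pd

# toy stand-in for dat, with the same columns
toy = pd.DataFrame({'RT':   [300, 320, 410, 450],
                    'p_go': [0.5, 0.5, 0.75, 0.75],
                    'd':    [1, 4, 1, 4]})

# condition 1: p_go = 0.5 and d = 1
cond1 = toy.query('p_go == 0.5 and d == 1')
print(cond1)
```

The same pattern, applied to `dat`, gives the subset to pass into `fitEG` for each of the four conditions.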