Recall from last time that we would like to model RT as a sum of two random variables $D$ and $M$, where $D$ represents the decision component and $M$ represents the residual (motor) component.
These random variables have associated probability density functions: $D$ has a normal density $f$ with parameters $\mu$ and $\sigma$, and $M$ has an exponential density $g$ with rate parameter $\lambda$.
Then, the probability density for $RT = D+M$ is given by the convolution of $f$ and $g$, defined as:

$$ f_{EG}(t\mid \mu,\sigma,\lambda) = \int_{-\infty}^{t} f(x)\,g(t-x)\,dx $$

Note that the upper limit is $t$ rather than $+\infty$ because the exponential density $g$ vanishes for negative arguments. Let's see that the convolution works in principle:
In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import norm, expon
In [0]:
T = np.linspace(0, 600, 200)   # time grid (ms)
plt.plot(T, norm.pdf(T, loc=300, scale=50))   # normal density f: mu = 300, sigma = 50
plt.show()
In [0]:
plt.plot(T, expon.pdf(T, scale=100))   # exponential density g; note: scale = 1/lambda
plt.show()
In [0]:
f = norm.pdf(T, loc=300, scale=50)
g = expon.pdf(T, scale=100)
fg = np.convolve(f, g)   # discrete convolution; the x-axis below is sample index, not time
plt.plot(fg)
plt.show()
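One caveat: np.convolve returns the raw discrete convolution indexed by sample number, so the x-axis above is in samples rather than milliseconds, and the values must be scaled by the grid spacing to approximate the continuous density. A quick sketch of the correction, reusing T, f, and g from above (the time axis works out because both grids share the same spacing):
In [0]:
# scale by the grid spacing dt so the result approximates the continuous density
dt = T[1] - T[0]
fg = np.convolve(f, g)*dt
# two length-n grids convolve to 2n-1 points, spanning 2*T[0] to 2*T[-1]
T_conv = np.linspace(2*T[0], 2*T[-1], len(fg))
plt.plot(T_conv, fg)
plt.show()
print(np.sum(fg)*dt)   # Riemann sum of the density; should be close to 1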
In [0]:
# let's use the built-in ex-Gaussian density from SciPy (thanks Nicholas!)
from scipy.stats import exponnorm
plt.plot(T, exponnorm.pdf(T, loc=200, scale=50, K=2))
plt.show()
Different authors use different parameterizations of the ex-Gaussian density. I prefer the one used by Matzke & Wagenmakers (2009), which involves the following three parameters:

- $\mu$ = the mean of the normal component
- $\sigma$ = the standard deviation of the normal component
- $\tau$ = the mean of the exponential component (i.e., $\tau = 1/\lambda$)

The density function is given by:
$$ f(t\mid \mu, \sigma, \tau) = \frac{1}{\tau \sqrt{2\pi}}\exp\Bigl(\frac{\sigma^2}{2\tau^2}-\frac{t-\mu}{\tau}\Bigr)\int_{-\infty}^{[(t-\mu)/\sigma]-(\sigma/\tau)}\exp \Bigl( -\frac{y^2}{2}\Bigr)dy $$

One advantage of this parameterization is that we have the following:

- $E(RT) = \mu + \tau$
- $\mathrm{Var}(RT) = \sigma^2 + \tau^2$

implying that the overall mean (and variance) of our RT distribution can be thought of as "decomposed" into a "normal" part and a "tail" part.
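Before moving on, we can sanity-check both the density formula and these identities numerically. A quick sketch with arbitrary illustrative values ($\mu = 300$, $\sigma = 50$, $\tau = 100$), using the fact that $\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-y^2/2}dy = \Phi(z)$, the standard normal CDF:
In [0]:
# evaluate the Matzke & Wagenmakers density directly via the normal CDF
mu0, sigma0, tau0 = 300, 50, 100   # arbitrary illustrative values
t = np.linspace(0, 800, 400)
direct = (1/tau0)*np.exp(sigma0**2/(2*tau0**2) - (t - mu0)/tau0) * norm.cdf((t - mu0)/sigma0 - sigma0/tau0)
# compare against scipy's exponnorm (difference should be near machine precision)
print(np.max(np.abs(direct - exponnorm.pdf(t, loc=mu0, scale=sigma0, K=tau0/sigma0))))
# verify the moment identities by simulation
sim = exponnorm.rvs(size=100000, loc=mu0, scale=sigma0, K=tau0/sigma0)
print(sim.mean(), mu0 + tau0)            # both approximately 400
print(sim.var(), sigma0**2 + tau0**2)    # both approximately 12500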
In [0]:
# in SciPy's exponnorm, this parameterization corresponds to:
#
#   mu    = loc
#   sigma = scale
#   tau   = K*sigma   (so K = tau/sigma)
# we can now define the negative log-likelihood (NLL) for the ex-Gaussian
# first, let's load our data:
dat = pd.read_csv("https://raw.githubusercontent.com/tomfaulkenberry/courses/master/summer2019/mathpsychREU/schwarz-A.csv")
X = dat.RT

def nllEG(pars):
    mu, sigma, tau = pars
    ll = np.log(exponnorm.pdf(X, loc=mu, scale=sigma, K=tau/sigma))
    return -np.sum(ll)
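One numerical caveat with this definition: for extreme parameter values visited during optimization, exponnorm.pdf can underflow to zero, in which case np.log returns -inf. A sketch of an equivalent but more stable variant using SciPy's logpdf method (nllEG_stable is just an illustrative name):
In [0]:
# numerically safer variant: work on the log scale directly
def nllEG_stable(pars):
    mu, sigma, tau = pars
    return -np.sum(exponnorm.logpdf(X, loc=mu, scale=sigma, K=tau/sigma))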
In [0]:
# test our nllEG function
pars=np.array([200, 50, 100])
nllEG(pars)
In [0]:
# minimize the NLL against observed RTs
from scipy.optimize import minimize

# starting values from the moment heuristics of Lacouture & Cousineau (2008):
# tau ~ 0.8*SD, mu ~ mean - tau, sigma^2 ~ variance - tau^2
tau_init = 0.8*X.std()
mu_init = X.mean() - tau_init
sigma_init = np.sqrt(X.var() - tau_init**2)
# store init parameters as numpy array
inits = np.array([mu_init, sigma_init, tau_init])
mleEG = minimize(fun=nllEG, x0=inits, method='nelder-mead')
print(mleEG)
In [0]:
# check that overall mean is decomposed into mu + tau
print(X.mean())
print(mleEG.x[0]+mleEG.x[2])
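Since the variance should similarly decompose into $\sigma^2 + \tau^2$, we can check that as well (the match is only approximate, since maximum likelihood does not force the sample moments to agree exactly):
In [0]:
# check the variance decomposition sigma^2 + tau^2 (approximate for MLEs)
print(X.var())
print(mleEG.x[1]**2 + mleEG.x[2]**2)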
In [0]:
# let's plot the fitted density against our observed data
plt.hist(X, bins=np.linspace(100, 600, 25), density=True, color='lightgray')
# extract fit and plot it
mu, sigma, tau = mleEG.x
T = np.linspace(100,600,100)
plt.plot(T, exponnorm.pdf(T, loc=mu, scale=sigma, K=tau/sigma))
plt.show()
In [0]:
# compare to normal fit from last time
# construct NLL for normal
def nllNormal(pars):
    mu, sigma = pars
    ll = np.log(norm.pdf(X, loc=mu, scale=sigma))
    return -np.sum(ll)
mu_init = X.mean()
sigma_init = X.std()
# store init parameters as numpy array
inits = np.array([mu_init, sigma_init])
mleNormal = minimize(fun=nllNormal, x0=inits, method='nelder-mead')
plt.hist(X, bins=np.linspace(100, 600, 25), density=True, color='lightgray')
T = np.linspace(100, 600, 100)
# extract normal fit and plot it
mean, sd = mleNormal.x
plt.plot(T, norm.pdf(T, loc=mean, scale=sd), '--')
# extract ex-Gaussian fit and plot it
mu, sigma, tau = mleEG.x
plt.plot(T, exponnorm.pdf(T, loc=mu, scale=sigma, K=tau/sigma))
plt.show()
In [0]:
# get ex-Gaussian parameters
mu, sigma, tau = mleEG.x
print(mu)
print(sigma)
print(tau)
In [0]:
# use bootstrapping to get 95% confidence interval on parameters
# first, define a function to do MLE and return parameters
def fitEG(x):
    # negative log-likelihood for the ex-Gaussian, given data x
    def nllEG(pars):
        mu, sigma, tau = pars
        ll = np.log(exponnorm.pdf(x, loc=mu, scale=sigma, K=tau/sigma))
        return -np.sum(ll)
    # starting values (Lacouture & Cousineau, 2008)
    tau_init = 0.8*x.std()
    mu_init = x.mean() - tau_init
    sigma_init = np.sqrt(x.var() - tau_init**2)
    inits = np.array([mu_init, sigma_init, tau_init])
    # minimize the NLL
    mleEG = minimize(fun=nllEG, x0=inits, method='nelder-mead')
    # return the fitted parameters
    return mleEG.x[0], mleEG.x[1], mleEG.x[2]
In [0]:
# now, we do the bootstrapping
# set number of bootstrap samples (kept small here for speed)
N = 100
# set seed for reproducibility
np.random.seed(122)
# obtain the original model fit
mu, sigma, tau = fitEG(X)
# set empty lists to store parameter values
Mu = []
Sigma = []
Tau = []
for i in range(N):
    # generate a simulated dataset from the fitted model
    Xsim = exponnorm.rvs(size=len(X), loc=mu, scale=sigma, K=tau/sigma)
    # find MLEs for mu, sigma, and tau on the simulated data
    mu_bootstrap, sigma_bootstrap, tau_bootstrap = fitEG(Xsim)
    # append these estimates
    Mu.append(mu_bootstrap)
    Sigma.append(sigma_bootstrap)
    Tau.append(tau_bootstrap)
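Note that this is a parametric bootstrap: each replicate is simulated from the fitted model. A common alternative is the nonparametric bootstrap, which resamples the observed RTs with replacement. A minimal sketch of that variant for $\mu$, reusing fitEG:
In [0]:
# nonparametric alternative: resample the observed RTs with replacement
MuNP = []
for i in range(N):
    Xresamp = X.sample(n=len(X), replace=True)
    mu_np, sigma_np, tau_np = fitEG(Xresamp)
    MuNP.append(mu_np)
print(np.quantile(MuNP, [0.025, 0.975]))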
In [0]:
plt.hist(Mu, color="lightgray")
plt.show()
plt.hist(Sigma, color="lightgray")
plt.show()
plt.hist(Tau, color="lightgray")
plt.show()
In [0]:
# 95% bootstrap confidence intervals for each parameter
print(np.quantile(Mu, [0.025, 0.975]))
print(np.quantile(Sigma, [0.025, 0.975]))
print(np.quantile(Tau, [0.025, 0.975]))
In [0]:
# next step in modeling -- look at effects of experimental manipulations
# Schwarz (2001) manipulated 'p_go' and 'd'
dat.head()
In [0]:
# Homework:
#
# your task is to find ex-Gaussian fits (and bootstrap confidence intervals) for each of the following conditions:
# 1. p_go = 0.5 and d = 1
# 2. p_go = 0.5 and d = 4
# 3. p_go = 0.75 and d = 1
# 4. p_go = 0.75 and d = 4
# hint: use the 'query' method from pandas
example = dat.query('p_go == 0.5')
print(example.head())
print(example.shape)
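As a further sketch (just the filtering step, not the full solution), query can combine both factors at once, and the resulting RTs can then be passed to fitEG exactly as before:
In [0]:
# combine both factors in one query (condition 1, as an illustration)
cond = dat.query('p_go == 0.5 and d == 1')
print(cond.shape)
# mu1, sigma1, tau1 = fitEG(cond.RT)   # then bootstrap as above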