Name: Navina Govindaraj
Date: April 2017
The Stroop dataset contains data from participants who were presented with a list of words, with each word displayed in a color of ink. The participant’s task was to say out loud the color of the ink in which the word was printed. The task had two conditions: a congruent words condition, and an incongruent words condition.
In [1]:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline
In [2]:
stroop_data = pd.read_csv('./stroopdata.csv')
stroop_data.head()
Out[2]:
Independent variable: Treatment condition consisting of congruent and incongruent words
Dependent variable: Response time
$H_0 : \mu_C = \mu_I $ There is no difference in mean response time between the congruent and incongruent word conditions
$H_a : \mu_C \neq \mu_I $ There is a difference in mean response time between the congruent and incongruent word conditions
$\mu_C$ and $\mu_I$ denote the population means for the congruent and incongruent groups respectively.
Statistical test: A dependent t-test for paired samples will be used.
This is a within-subject design: the same participants are measured under both conditions, so the two sets of response times are paired rather than independent.
The reasons for choosing a t-test rather than a z-test are as follows:
1) The sample size is less than 30
2) The population standard deviation is unknown
3) It is assumed that the response times, and hence their paired differences, are approximately Gaussian
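The paired t-statistic behind this choice is $t = \bar{d} / (s_d / \sqrt{n})$ with $n - 1$ degrees of freedom, where $d$ is the per-participant difference between the two conditions. A minimal sketch using only the standard library follows; the six response times are made-up illustrative values, not rows from stroopdata.csv.

```python
import math
import statistics

# Hypothetical response times in seconds (illustration only, not the Stroop data)
congruent = [12.1, 16.8, 9.6, 8.6, 14.5, 12.2]
incongruent = [19.3, 18.7, 21.2, 15.7, 22.8, 20.9]

# Per-participant differences: each participant serves as their own control
diffs = [c - i for c, i in zip(congruent, incongruent)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)    # sample standard deviation (n - 1 denominator)
std_error = sd_diff / math.sqrt(n)   # standard error of the mean difference

t_stat = mean_diff / std_error
df = n - 1
print("t = %.3f, df = %d" % (t_stat, df))
```

Applied to the real columns, the same arithmetic should agree with the statistic returned by `stats.ttest_rel`.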
In [3]:
stroop_data.describe()
Out[3]:
In [4]:
print("Median:\n", stroop_data.median(), sep="")
print("\nVariance:\n", stroop_data.var(), sep="")
In [5]:
# One row of three panels on a shared y-axis (response time in seconds)
fig, axs = plt.subplots(figsize=(18, 5), ncols=3, sharey=True)
sns.set_palette("Set2")
# Fig 1 - Congruent Words - Response Time
sns.boxplot(y="Congruent", data=stroop_data,
            ax=axs[0]).set_title("Fig 1: Congruent Words - Response Time (in seconds)")
# Fig 2 - Incongruent Words - Response Time
sns.boxplot(y="Incongruent", data=stroop_data, color="coral",
            ax=axs[1]).set_title("Fig 2: Incongruent Words - Response Time (in seconds)")
# Fig 3 - Congruence vs. Incongruence
sns.regplot(x="Congruent", y="Incongruent", data=stroop_data, color="m", fit_reg=False,
            ax=axs[2]).set_title("Fig 3: Congruence vs. Incongruence (in seconds)")
Out[5]:
α: 0.05
Confidence level: 95%
t-critical value (two-tailed, df = 23): ±2.069
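A critical value for a given α and degrees of freedom can be computed with `scipy.stats.t.ppf` rather than read from a table. The sketch below assumes df = 23 (24 participants), which is an assumption: the sample size is not stated explicitly in the text. Because $H_a$ is two-sided, α is split across both tails; the one-tailed value is shown only for comparison.

```python
from scipy import stats

alpha = 0.05
df = 23  # assumption: 24 paired observations, so df = n - 1 = 23

# Two-tailed test: put alpha / 2 in each tail
t_crit_two_tailed = stats.t.ppf(1 - alpha / 2, df)

# One-tailed value, for comparison only
t_crit_one_tailed = stats.t.ppf(1 - alpha, df)

print("two-tailed: +/-%.3f" % t_crit_two_tailed)
print("one-tailed: %.3f" % t_crit_one_tailed)
```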
In [6]:
# Dependent t-test for paired samples
stats.ttest_rel(stroop_data["Congruent"], stroop_data["Incongruent"])
Out[6]:
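`stats.ttest_rel` returns the t-statistic together with a two-sided p-value, so the decision can be made by comparing the p-value with α. A minimal sketch of that step, using made-up response times (hypothetical values, not the Stroop data):

```python
from scipy import stats

alpha = 0.05

# Hypothetical response times in seconds (illustration only)
congruent = [12.1, 16.8, 9.6, 8.6, 14.5, 12.2]
incongruent = [19.3, 18.7, 21.2, 15.7, 22.8, 20.9]

# Paired test: the i-th entries of both lists belong to the same participant
t_stat, p_value = stats.ttest_rel(congruent, incongruent)

# Reject H0 when the two-sided p-value falls below alpha
reject_null = p_value < alpha
print("t = %.3f, p = %.4f, reject H0: %s" % (t_stat, p_value, reject_null))
```

A negative t-statistic here means the first argument (congruent) has the smaller mean response time.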
6. Optional: What do you think is responsible for the effects observed? Can you think of an alternative or similar task that would result in a similar effect? Some research about the problem will be helpful for thinking about these two questions!