Pre-requisites

  • Install Anaconda for your platform

https://www.continuum.io/

http://jupyter.readthedocs.io/en/latest/install.html#id3

  • Then run the following command on the command line:
jupyter notebook

Data Import

We first load pandas in order to import the CSV file. We also import NumPy, SciPy, and the seaborn and matplotlib plotting libraries.


In [17]:
%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats  # load the stats submodule explicitly; scipy.stats.ttest_ind is used for the t-test

sns.set_style("whitegrid")

We now import the data using the pandas function read_csv().


In [2]:
k_rolls = pd.read_csv("../data_and_papers_reproducibility/Kitchen_Rolls.csv")
k_rolls


Out[2]:
ParticipantNumber Condition q1_check q2_check q1_NEO q2_NEO q3_NEO q4_NEO q5_NEO q6_NEO ... q12_NEO mean_NEO q3_check q4_check Include Rotation Age Sex Student Major.Occupation
0 1 1 2 6 4 2 1 3 4 4 ... 5 0.666667 5 5 True counter 25 M Y Rechten
1 2 2 3 5 5 3 4 4 3 5 ... 4 1.166667 8 1 True clock 20 F Y Taal- en Cultuurstudies
2 3 3 7 3 4 2 4 4 4 5 ... 4 0.833333 7 2 True counter 25 F Y Politicologie
3 4 4 4 5 3 3 2 2 2 3 ... 3 0.000000 7 4 True clock 19 F Y Psychologie
4 5 1 3 3 1 1 1 3 3 4 ... 2 -0.250000 5 2 True counter 20 F Y Geneeskunde
5 6 2 3 1 3 5 4 5 4 5 ... 2 0.833333 6 2 True clock 26 F Y Communicatiewetenschap
6 7 3 4 7 5 4 2 5 4 4 ... 4 0.666667 5 7 True counter 23 M Y Psychobiologie
7 8 4 7 2 4 5 4 3 3 3 ... 4 0.833333 7 4 True clock 22 F Y Psychologie
8 9 1 5 3 5 4 4 4 4 5 ... 4 0.916667 7 6 True counter 18 F Y x
9 10 2 4 6 4 4 2 2 4 4 ... 4 0.833333 8 2 True clock 19 F Y Sociologie
10 11 3 4 2 2 2 2 4 4 4 ... 3 0.250000 7 2 True counter 24 F Y Psychologie
11 12 4 7 2 3 1 4 5 4 4 ... 2 0.750000 8 2 True clock 20 F Y Psychobiologie
12 13 1 6 3 4 3 2 5 3 4 ... 4 0.666667 6 4 True counter 21 F Y Psychobiologie
13 14 2 3 5 4 4 2 3 3 3 ... 2 0.333333 5 5 True clock 19 F Y Psychologie
14 15 3 4 5 3 4 5 5 4 4 ... 4 0.833333 3 7 True counter 21 F N Tekenaar
15 16 4 7 4 5 2 3 4 4 3 ... 5 0.750000 7 3 True clock 19 F Y Pedagogische Wetenschappen
16 17 1 4 5 1 2 2 4 3 4 ... 3 -0.166667 7 1 True counter 21 F Y Oudheidkunde
17 18 2 0 3 4 1 1 5 2 5 ... 4 0.250000 6 4 True clock 26 F Y Bestuurskunde
18 19 3 2 2 5 3 4 4 4 4 ... 4 0.916667 5 4 True counter 21 M Y Psychologie & Wijsbegeerte
19 20 4 0 6 5 5 4 5 5 5 ... 5 1.833333 8 5 True clock 23 M Y Psychologie
20 21 1 0 5 3 3 4 4 3 3 ... 5 0.916667 6 6 True counter 41 F N Boekhouder
21 23 3 3 4 5 5 5 5 5 3 ... 5 1.666667 5 7 True counter 25 M Y History & Philosophy of Science
22 24 4 6 2 4 4 4 3 5 4 ... 4 1.083333 7 4 True clock 31 F Y Psychologie
23 25 1 3 5 1 2 3 4 2 1 ... 1 0.166667 7 2 True counter 22 F Y Algemene Sociale Wetenschappen
24 26 2 1 7 1 1 1 3 1 3 ... 1 -0.583333 7 0 True clock 19 F Y Psychologie
25 27 3 4 0 5 1 2 2 3 5 ... 5 0.666667 8 1 True counter 23 M Y Psychobiologie
26 28 4 5 7 3 3 3 2 2 4 ... 2 -0.250000 8 3 True clock 19 F Y Psychologie
27 29 1 4 4 5 5 5 4 5 5 ... 4 1.083333 6 5 True counter 19 M Y Psychobiologie
28 30 2 0 1 4 4 3 4 4 2 ... 3 0.583333 3 6 True clock 21 F Y Taal & Communicatie
29 31 3 1 7 5 3 4 4 4 5 ... 4 1.083333 3 8 True counter 21 F Y Psychobiologie & Wijsbegeerte
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
72 79 3 3 3 4 3 1 3 4 4 ... 4 0.666667 7 1 True counter 18 F Y Psychologie
73 80 4 2 2 5 4 4 4 4 4 ... 4 1.166667 7 2 True clock 24 M Y Spaanse Taal & Cultuur
74 81 1 1 8 4 2 4 2 2 4 ... 3 0.416667 4 3 True counter 20 F Y Communicatiewetenschap
75 82 2 5 1 5 5 5 4 3 4 ... 4 1.583333 7 4 True clock 19 F Y Geschiedenis
76 83 3 4 3 4 4 2 2 2 4 ... 3 0.250000 8 3 True counter 27 F N x
77 85 1 3 4 2 2 1 4 1 3 ... 2 -0.250000 7 5 True counter 21 F Y Biomedical Sciences
78 86 2 2 2 4 3 2 2 3 3 ... 4 0.500000 8 2 True clock 22 F Y Psychobiologie
79 87 3 4 4 3 3 2 4 3 2 ... 2 0.083333 7 3 True counter 20 F Y Biomedisch
80 88 4 6 7 2 5 4 2 4 3 ... 3 0.750000 7 7 True clock 28 F Y Psychologie
81 89 1 8 0 5 4 4 4 2 5 ... 2 0.916667 9 1 True counter 18 F Y Psychologie
82 90 2 7 8 4 2 1 3 2 5 ... 2 0.250000 5 6 True clock 21 F Y Psychobiologie
83 91 3 6 6 5 5 5 5 5 5 ... 5 2.000000 4 7 True counter 25 F Y ACW/Kunstgeschiedenis
84 92 4 3 5 4 5 4 3 5 5 ... 3 1.250000 6 5 True clock 19 F Y Nederlands
85 93 1 6 5 5 4 4 5 3 3 ... 4 1.083333 7 3 True counter 22 M Y Psychobiologie
86 94 2 3 2 4 3 2 4 2 3 ... 2 0.083333 5 4 True clock 19 F Y Psychologie
87 95 3 3 6 5 5 4 4 3 3 ... 2 0.583333 7 3 True counter 21 F Y Psychologie
88 97 1 3 5 4 2 4 2 4 4 ... 5 0.833333 7 2 True counter 20 M Y Media & Cultuur
89 98 2 0 4 5 5 4 2 5 5 ... 5 1.166667 6 4 True clock 25 M Y Bestuurs- en Organisatiewetenschappen
90 99 3 1 7 5 2 2 4 4 4 ... 5 1.000000 5 3 True counter 21 F Y Psychologie
91 101 1 3 7 5 4 2 5 4 5 ... 4 1.000000 5 4 True counter 19 M Y Psychobiologie
92 102 2 2 6 1 1 1 5 2 5 ... 4 0.000000 5 3 True clock 20 M Y Psychobiologie
93 103 3 5 0 4 4 3 4 3 5 ... 2 0.666667 6 4 True counter 19 F Y Psychologie
94 105 1 4 4 4 4 4 4 4 3 ... 2 0.750000 4 7 True counter 22 F Y Psychobiologie
95 106 2 4 2 3 1 1 4 3 5 ... 3 0.333333 7 3 True clock 19 F Y Psychobiologie
96 107 3 5 7 5 3 4 4 4 5 ... 5 1.250000 8 0 True counter 20 M Y Psychologie
97 109 1 6 4 4 1 1 4 2 4 ... 2 0.083333 3 7 True counter 19 F Y ASW
98 110 2 0 10 5 4 2 5 4 2 ... 2 1.000000 4 0 True clock 22 F Y Psychologie
99 111 3 5 5 5 4 4 4 4 5 ... 5 1.333333 7 1 True counter 22 M Y Psychobiologie en filosofie
100 113 1 0 10 5 5 1 2 2 5 ... 4 0.333333 5 10 True counter 27 M Y Economie
101 115 3 2 6 5 4 2 4 4 5 ... 3 1.166667 7 4 True counter 18 F Y Psychologie

102 rows × 25 columns
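Displaying the whole frame produces a long listing. A more compact inspection might look like the sketch below. It uses a small inline CSV as a stand-in: the values are made up for illustration and are not the Kitchen_Rolls data, and the name `demo` is hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for Kitchen_Rolls.csv; values are illustrative only.
csv_text = """ParticipantNumber,Rotation,mean_NEO
1,counter,0.5
2,clock,1.0
3,counter,0.25
4,clock,0.75
"""

# read_csv accepts a file path or any file-like object
demo = pd.read_csv(io.StringIO(csv_text))

print(demo.shape)    # (number of rows, number of columns)
print(demo.head(2))  # first two rows only
print(demo.dtypes)   # column types inferred by read_csv
```

The same three calls can be applied to k_rolls to check the size and column types without printing every row.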


In [3]:
k_rolls_clock = k_rolls[k_rolls["Rotation"] == "clock"]
k_rolls_counter = k_rolls[k_rolls["Rotation"] == "counter"]
# k_rolls_counter
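The cell above splits the frame with boolean masks. As a rough sketch, the same selection and a per-group summary could be written as follows. The toy frame `demo` is hypothetical and its values are not taken from the study.

```python
import pandas as pd

# Hypothetical toy data; not taken from the Kitchen_Rolls file.
demo = pd.DataFrame({
    "Rotation": ["clock", "counter", "clock", "counter"],
    "mean_NEO": [1.0, 0.5, 0.5, 0.0],
})

# Boolean-mask selection, the same pattern as in the cell above
demo_clock = demo[demo["Rotation"] == "clock"]

# Per-group mean without creating a separate frame for each group
group_means = demo.groupby("Rotation")["mean_NEO"].mean()
print(group_means)
```

Separate frames are convenient when each group is passed to another function, as in the t-test later on. groupby is usually enough for quick summaries.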

Beeswarm plot

We create a beeswarm plot with the seaborn module and draw a box plot over it.


In [16]:
ax = sns.swarmplot(x="Rotation", y="mean_NEO", color="coral", data=k_rolls)
ax = sns.boxplot(x="Rotation", y="mean_NEO", hue="Rotation", palette="muted", data=k_rolls)

plt.show()


T-test

We now perform the t-test, as in the R Markdown document. To do so, we extract NumPy arrays from our pandas data frames.


In [24]:
clock_scores = k_rolls_clock["mean_NEO"].values
counter_scores = k_rolls_counter["mean_NEO"].values

# run the t-test
scipy.stats.ttest_ind(clock_scores, counter_scores)


Out[24]:
Ttest_indResult(statistic=-0.75360532612302222, pvalue=0.45285696533484487)
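One point to keep in mind when comparing with R: scipy.stats.ttest_ind assumes equal variances by default (Student's t-test), whereas R's t.test defaults to Welch's test. Whether the R Markdown document relied on that default is an assumption here, since it is not shown. The sketch below runs both variants on made-up arrays. The names `group_a` and `group_b` are hypothetical and the numbers are not the study data.

```python
import numpy as np
from scipy import stats

# Made-up scores for illustration; not the study data.
group_a = np.array([0.5, 1.0, 0.75, 1.25, 0.25])
group_b = np.array([0.0, 0.5, 1.5, 0.75, 1.0, 0.25])

# Student's t-test: equal variances assumed (SciPy's default)
student = stats.ttest_ind(group_a, group_b)

# Welch's t-test: no equal-variance assumption (R's t.test default)
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(student.statistic, student.pvalue)
print(welch.statistic, welch.pvalue)
```

When the two groups have similar sizes and variances, both variants tend to give close results. Stating which one was used makes the comparison with the R output easier to check.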