It is considered good practice to import all the modules you use in a notebook at the beginning, so we'll start with that:
In [3]:
import string
We'll be using two string constants defined in the string-module:
In [34]:
print(string.ascii_lowercase)
print(type(string.ascii_lowercase))
print(string.digits)
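Because both constants are ordinary strings, the `in` operator can test whether a single character occurs in them. A minimal sketch of that idea follows; the helper name `classify` is made up for illustration and is not part of the exercise.

```python
import string

def classify(char):
    """Return 'letter', 'digit' or 'other' for a single character."""
    if char in string.ascii_lowercase:
        return 'letter'
    elif char in string.digits:
        return 'digit'
    return 'other'

print(classify('x'))  # letter
print(classify('7'))  # digit
print(classify('X'))  # other (uppercase is not in ascii_lowercase)
```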
In [9]:
logfile_name = '../src/logs/0023_FCA_2017-03-09.log'
Open the file, read the lines & close the file.
In [10]:
fp = open(logfile_name, 'r')
all_lines = fp.readlines()
fp.close()
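An alternative you will often see is the `with`-statement, which closes the file for you. The sketch below writes a small stand-in file first so that it runs anywhere; the file contents and the name `demo_lines` are invented for the example.

```python
import os
import tempfile

# Create a small stand-in file (made-up contents, not a real log)
tmp = tempfile.NamedTemporaryFile('w', suffix='.log', delete=False)
tmp.write('# comment\nfirst line\nsecond line\n')
tmp.close()

# The with-block closes the file automatically, even if an error occurs
with open(tmp.name, 'r') as fp:
    demo_lines = fp.readlines()

print(fp.closed)   # True
print(demo_lines)  # ['# comment\n', 'first line\n', 'second line\n']

os.remove(tmp.name)  # clean up the stand-in file
```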
Display the first ten lines. For this, you can use the slice-syntax [:10], which reads: 'from the start up to, but not including, index 10'.
In [30]:
all_lines[:10]
Out[30]:
The first five lines are comments, which we'll want to skip over. How many events are there in the file (how many rows after the comments)?
In [12]:
len(all_lines[5:])
Out[12]:
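To see how the two slices used so far relate to each other, here is a small sketch on a made-up list (`rows` is a stand-in for `all_lines`, with five "comment" items followed by three "event" items).

```python
# Made-up stand-in for all_lines: 5 comment rows, then 3 event rows
rows = ['c0', 'c1', 'c2', 'c3', 'c4', 'e0', 'e1', 'e2']

print(rows[:5])       # ['c0', 'c1', 'c2', 'c3', 'c4']
print(rows[5:])       # ['e0', 'e1', 'e2']
print(len(rows[5:]))  # 3
```

Together, `rows[:5]` and `rows[5:]` cover the whole list exactly once, because the end index of a slice is excluded.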
In [14]:
field_sep = '\t'  # the fields are separated by the tab-character
Split the 6th line and display:
In [15]:
line = all_lines[5]
split_line = line.split(field_sep)
print(split_line)
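If you want to experiment with `split` without the log file, the following sketch may help. The example line is hypothetical: the time stamp and the middle field are invented, and only the overall shape (three tab-separated fields) is assumed to match the real data.

```python
field_sep = '\t'

# Hypothetical log line: time stamp and middle field are invented
example_line = '104523\tevent\tSTIM=x\n'

parts = example_line.split(field_sep)
print(parts)             # ['104523', 'event', 'STIM=x\n']
print(len(parts))        # 3
print(parts[2].strip())  # STIM=x
```

Note that the newline character stays attached to the last field; `strip` removes it if needed.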
The 1st value of the split list is the time; the 3rd value tells us whether the event was a stimulus presentation or a response. Since the data is consistent, the actual stimulus presented (letter or digit) is always at the same position in the 3rd value, directly after the equal-sign. To find its index, count the characters up to and including the equal-sign (remember that indexing starts from 0):
In [19]:
# what is the index of the stimulus?
# Try changing the relevant value below until you get 'x'
split_line[2][5]
Out[19]:
In [20]:
idx = 5 # which index gives you the letter/digit?
Note that this index is also the one we need for getting to the response (1 or 2).
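Instead of counting by hand, the index could also be computed with the string method `index`. This is only a sketch: the two field values below are assumed to have the form 'STIM=...' and 'RESP=...', and the names `idx_stim` and `idx_resp` are made up.

```python
stim_field = 'STIM=x\n'  # assumed form of a stimulus field
resp_field = 'RESP=1\n'  # assumed form of a response field

# The value of interest sits one position after the equal-sign
idx_stim = stim_field.index('=') + 1
idx_resp = resp_field.index('=') + 1

print(idx_stim, idx_resp)    # 5 5
print(stim_field[idx_stim])  # x
print(resp_field[idx_resp])  # 1
```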
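For the 6th (stimulus) and 7th (response) lines: extract the time and the stimulus/response value, convert the times to integers (using the int-function), calculate the reaction time ('RT') and print it.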
In [22]:
# 6th line: STIM
line = all_lines[5]
split_line = line.split(field_sep)
print(split_line)
stim_time = split_line[0]  # the time is the 1st field
cur_stim = split_line[2][idx]  # the stimulus is at index idx of the 3rd field
print(stim_time, cur_stim)
# 7th line: RESP
line = all_lines[6]
split_line = line.split(field_sep)
print(split_line)
resp_time = split_line[0]  # the time is the 1st field
cur_resp = split_line[2][idx]  # the response is at index idx of the 3rd field
print(resp_time, cur_resp)
# calculate RT
RT = int(resp_time) - int(stim_time)  # response time minus stimulus time
print('reaction time: ', RT)
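The conversion with `int` is needed because `split` returns strings, and strings cannot be subtracted. A short sketch with made-up time stamps:

```python
stim_time = '104523'  # made-up time stamps, stored as strings
resp_time = '109871'

try:
    resp_time - stim_time
except TypeError as err:
    print('cannot subtract strings:', err)

RT = int(resp_time) - int(stim_time)
print(RT)  # 5348
```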
Convert the above into something that can be used to loop over the list. Start by just looping over the 6th and 7th rows: you should arrive at the same answer as above.
You'll need logic for determining whether the current line starts with the string STIM. Strings have a method startswith for this! Use an if-else-construct.
In [26]:
'STIM=x\n'.startswith('STIM')
Out[26]:
In [32]:
for line in all_lines[5:]:
    split_line = line.split(field_sep)
    # does the 3rd element of the list start with 'STIM'?
    if split_line[2].startswith('STIM'):
        stim_time = split_line[0]
        cur_stim = split_line[2][idx]
        # print(stim_time, cur_stim)
    else:  # nope; it starts with something other than 'STIM'
        resp_time = split_line[0]
        cur_resp = split_line[2][idx]
        # print(resp_time, cur_resp)
        # calculate RT
        RT = int(resp_time) - int(stim_time)
        # print('reaction time: ', RT)
Instead of printing out 1280 RT values, we want to keep them in memory for later use (we need to calculate mean and median values over them). Start with two empty lists for reaction times, and use the .append-method to add the values to the lists.
In [33]:
# empty lists for reaction times
rt_freq = []
rt_rare = []
In [37]:
for line in all_lines[5:]:
    split_line = line.split(field_sep)
    # does the 3rd element of the list start with 'STIM'?
    if split_line[2].startswith('STIM'):
        stim_time = split_line[0]
        cur_stim = split_line[2][idx]
    else:  # nope; it starts with something other than 'STIM'
        resp_time = split_line[0]
        cur_resp = split_line[2][idx]
        # calculate RT
        RT = int(resp_time) - int(stim_time)
        # test if the current stimulus is in the `ascii_lowercase`-string
        if cur_stim in string.ascii_lowercase:
            rt_freq.append(RT)
        # else test if the current stimulus is in the `digits`-string
        elif cur_stim in string.digits:
            rt_rare.append(RT)
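The same logic can be tried out on a handful of lines held in memory, which makes it easy to check the result by hand. Everything in this sketch is invented for illustration: the four event lines, the middle field '-', and all names starting with `demo_`.

```python
import string

# Hypothetical events (time, unused field, event); all values are invented
demo_lines = [
    '1000\t-\tSTIM=x\n',
    '5200\t-\tRESP=1\n',
    '9000\t-\tSTIM=7\n',
    '15100\t-\tRESP=2\n',
]

demo_freq = []
demo_rare = []
for demo_line in demo_lines:
    fields = demo_line.split('\t')
    if fields[2].startswith('STIM'):
        # remember the stimulus until its response arrives
        demo_stim_time = int(fields[0])
        demo_stim = fields[2][5]
    else:
        demo_rt = int(fields[0]) - demo_stim_time
        if demo_stim in string.ascii_lowercase:
            demo_freq.append(demo_rt)
        elif demo_stim in string.digits:
            demo_rare.append(demo_rt)

print(demo_freq)  # [4200]
print(demo_rare)  # [6100]
```

The approach relies on every response line being preceded by its stimulus line, which is assumed to hold for the log files used here.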
In [51]:
rt_freq = []
rt_rare = []
n_corr_freq = 0
n_corr_rare = 0
In [52]:
for line in all_lines[5:]:
    split_line = line.split(field_sep)
    # does the 3rd element of the list start with 'STIM'?
    if split_line[2].startswith('STIM'):
        stim_time = split_line[0]
        cur_stim = split_line[2][idx]
    else:  # nope; it starts with something other than 'STIM'
        resp_time = split_line[0]
        cur_resp = split_line[2][idx]
        # calculate RT
        RT = int(resp_time) - int(stim_time)
        # test if the current stimulus is in the `ascii_lowercase`-string
        if cur_stim in string.ascii_lowercase:
            rt_freq.append(RT)
            if int(cur_resp) == 1:
                n_corr_freq = n_corr_freq + 1
        # else test if the current stimulus is in the `digits`-string
        elif cur_stim in string.digits:
            rt_rare.append(RT)
            if cur_resp == '2':
                n_corr_rare = n_corr_rare + 1
In [54]:
rt_freq[:10]
Out[54]:
Multiply the reaction times by 100e-3 (i.e. 0.1) to obtain milliseconds.
In [50]:
# copy-paste your mean- and median-function here:
def mean(values):
    return sum(values) / len(values)

def median(values):
    return sorted(values)[len(values) // 2]
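One thing to be aware of: the `median` function above picks the element at position `len(values) // 2` of the sorted list. For a list with an even number of values this is the upper of the two middle values, whereas the conventional median is their average. The sketch below compares it with `statistics.median` from the standard library on two made-up lists.

```python
import statistics

def median(values):
    # same definition as in the notebook
    return sorted(values)[len(values) // 2]

odd = [30, 10, 20]
even = [40, 10, 30, 20]

print(median(odd), statistics.median(odd))    # 20 20
print(median(even), statistics.median(even))  # 30 25.0
```

For long lists of reaction times the difference is usually small, but it can matter for short ones.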
In [58]:
# freq
mean_rt_freq = 0.1 * mean(rt_freq)
median_rt_freq = 0.1 * median(rt_freq)
accuracy_freq = 100 * n_corr_freq / len(rt_freq)
# rare
mean_rt_rare = 100e-3 * mean(rt_rare)
median_rt_rare = 100e-3 * median(rt_rare)
accuracy_rare = 100 * n_corr_rare / len(rt_rare)
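As a worked example of the two formulas, here is a sketch with made-up numbers. It assumes the raw times are in units of 0.1 ms, which is what the conversion factor 0.1 implies; the names starting with `demo_` are invented.

```python
demo_rts = [4200, 3900, 5100, 4400]  # made-up RTs, assumed units of 0.1 ms
demo_n_correct = 3                   # made-up number of correct responses

# mean RT in milliseconds: average first, then convert the units
demo_mean_ms = 0.1 * (sum(demo_rts) / len(demo_rts))
# accuracy in percent: correct responses out of all responses
demo_accuracy = 100 * demo_n_correct / len(demo_rts)

print(round(demo_mean_ms, 1))  # 440.0
print(demo_accuracy)           # 75.0
```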
In [59]:
print('Frequent category:')
print('------------------')
print('Mean:', mean_rt_freq)
print('Median:', median_rt_freq)
print('Accuracy:', accuracy_freq)
In [60]:
print('Rare category:')
print('--------------')
print('Mean:', mean_rt_rare)
print('Median:', median_rt_rare)
print('Accuracy:', accuracy_rare)
In [61]:
def read_log_file(logfile_name, field_sep='\t'):
    '''Read a single log file

    The default field-separator is set to be the tab-character (\t)

    Return the mean and median RT, and the accuracy, separately for
    the frequent and rare categories. These are returned as a tuple of
    6 values, in the order:
    (mean_rt_freq, median_rt_freq, accuracy_freq,
    mean_rt_rare, median_rt_rare, accuracy_rare)
    '''
    # initialise
    rt_freq = []
    rt_rare = []
    n_corr_freq = 0
    n_corr_rare = 0

    # open file and read all its lines into a list
    fp = open(logfile_name, 'r')
    all_lines = fp.readlines()
    fp.close()

    # hard-code the index of the stimulus/response type/number
    idx = 5

    # loop over lines from 6th onwards
    for line in all_lines[5:]:
        split_line = line.split(field_sep)
        # does the 3rd element of the list start with 'STIM'?
        if split_line[2].startswith('STIM'):
            stim_time = split_line[0]
            cur_stim = split_line[2][idx]
        else:  # nope; it starts with something other than 'STIM'
            resp_time = split_line[0]
            cur_resp = split_line[2][idx]
            # calculate RT
            RT = int(resp_time) - int(stim_time)
            # test if the current stimulus is in the `ascii_lowercase`-string
            if cur_stim in string.ascii_lowercase:
                rt_freq.append(RT)
                if int(cur_resp) == 1:
                    n_corr_freq = n_corr_freq + 1
            # else test if the current stimulus is in the `digits`-string
            elif cur_stim in string.digits:
                rt_rare.append(RT)
                if cur_resp == '2':
                    n_corr_rare = n_corr_rare + 1

    # freq
    mean_rt_freq = 0.1 * mean(rt_freq)
    median_rt_freq = 0.1 * median(rt_freq)
    accuracy_freq = 100 * n_corr_freq / len(rt_freq)
    # rare
    mean_rt_rare = 100e-3 * mean(rt_rare)
    median_rt_rare = 100e-3 * median(rt_rare)
    accuracy_rare = 100 * n_corr_rare / len(rt_rare)

    return (mean_rt_freq, median_rt_freq, accuracy_freq,
            mean_rt_rare, median_rt_rare, accuracy_rare)
In [62]:
(mean_rt_freq, median_rt_freq, accuracy_freq,
mean_rt_rare, median_rt_rare, accuracy_rare) = read_log_file(logfile_name)
In [63]:
print('Frequent category:')
print('------------------')
print('Mean:', mean_rt_freq)
print('Median:', median_rt_freq)
print('Accuracy:', accuracy_freq)
In [64]:
print('Rare category:')
print('--------------')
print('Mean:', mean_rt_rare)
print('Median:', median_rt_rare)
print('Accuracy:', accuracy_rare)
In [65]:
logfile_name = '../src/logs/0048_MSB_2016-09-23.log'
In [66]:
(mean_rt_freq, median_rt_freq, accuracy_freq,
mean_rt_rare, median_rt_rare, accuracy_rare) = read_log_file(logfile_name)
In [67]:
print('Frequent category:')
print('------------------')
print('Mean:', mean_rt_freq)
print('Median:', median_rt_freq)
print('Accuracy:', accuracy_freq)
In [68]:
print('Rare category:')
print('--------------')
print('Mean:', mean_rt_rare)
print('Median:', median_rt_rare)
print('Accuracy:', accuracy_rare)
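Since the function takes the file name as its argument, the same steps could be repeated for many files with a loop. The following is only a sketch: `summarise_logs` and `fake_reader` are hypothetical helpers, and `fake_reader` returns placeholder values (not real results) so that the example runs without the log files. In the notebook you would pass `read_log_file` instead.

```python
def summarise_logs(logfile_names, reader):
    """Apply `reader` to each file name and collect the results in a dict."""
    results = {}
    for name in logfile_names:
        results[name] = reader(name)
    return results

# Stand-in for read_log_file: returns placeholder values, not real results
def fake_reader(name):
    return (0.0, 0.0, 100.0, 0.0, 0.0, 100.0)

names = ['../src/logs/0023_FCA_2017-03-09.log',
         '../src/logs/0048_MSB_2016-09-23.log']

summary = summarise_logs(names, fake_reader)
for name, values in summary.items():
    print(name, values)
```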