The raw data behind the story Some People Are Too Superstitious To Have A Baby On Friday The 13th, which you can read here.
We'll be working with the data set from the Centers for Disease Control and Prevention's National National Center for Health Statistics.
The data set has the following structure:
In [1]:
f = open("../data/GP01/births.csv", 'r')
text = f.read()
print(text[:193])
In [2]:
lines_list = text.split("\n")
lines_list[:10]
Out[2]:
In [3]:
data_no_header = lines_list[1:len(lines_list)]
days_counts = dict()
for line in data_no_header:
split_line = line.split(",")
day_of_week = split_line[3]
num_births = int(split_line[4])
if day_of_week in days_counts:
days_counts[day_of_week] = days_counts[day_of_week] + num_births
else:
days_counts[day_of_week] = num_births
days_counts
Out[3]: