Physionet ECG-ID Database Files

This notebook downloads the ECG data from the Physionet ECG-ID Database in the form of tab-delimited .txt files with no header.

The file titles follow this pattern: Person##rec##.txt

The first colum of data is time (s), and the second column of data is ECG I filtered (mV).

According to the database description, there are 310 records in total, made up of between 2 and 20 records from each of 90 individuals. Each record is 20 seconds long.

In order for this notebook to work, you must first install the Physionet WFDB software and be able to run a %%bash cell. Download the RECORDS file from Physionet, and put it in the same directory as this notebook.

Click here to see the results of this notebook.

In [1]:
# Read the RECORDS file into a list
records = open('RECORDS','r')
rec_list =

# Get rid of the last line, which
# is an extra empty entry for some reason.
rec_list = rec_list.split('\n')
del rec_list[-1]

# Create a dict where the keys are the Person
# and the values are a list of corresponding rec
rec_dict = {}

for i in range(len(rec_list)):
    rec = rec_list[i].split('/')
    if rec[0] in rec_dict:
        rec_dict[rec[0]] = []

In [2]:
print len(rec_dict)
print rec_dict

{'Person_14': ['rec_1', 'rec_2', 'rec_3'], 'Person_15': ['rec_1', 'rec_2'], 'Person_16': ['rec_1', 'rec_2', 'rec_3'], 'Person_17': ['rec_1', 'rec_2'], 'Person_10': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_11': ['rec_1', 'rec_2', 'rec_3'], 'Person_12': ['rec_1', 'rec_2'], 'Person_13': ['rec_1', 'rec_2'], 'Person_18': ['rec_1', 'rec_2'], 'Person_19': ['rec_1', 'rec_2'], 'Person_90': ['rec_1', 'rec_2'], 'Person_61': ['rec_1', 'rec_2', 'rec_3', 'rec_4'], 'Person_60': ['rec_1', 'rec_2', 'rec_3'], 'Person_63': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6'], 'Person_62': ['rec_1', 'rec_2', 'rec_3'], 'Person_65': ['rec_1', 'rec_2'], 'Person_64': ['rec_1', 'rec_2', 'rec_3'], 'Person_67': ['rec_1', 'rec_2', 'rec_3'], 'Person_66': ['rec_1', 'rec_2'], 'Person_69': ['rec_1', 'rec_2'], 'Person_68': ['rec_1', 'rec_2'], 'Person_72': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6', 'rec_7', 'rec_8'], 'Person_73': ['rec_1', 'rec_2'], 'Person_70': ['rec_1', 'rec_2', 'rec_3'], 'Person_71': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_76': ['rec_1', 'rec_2', 'rec_3'], 'Person_77': ['rec_1', 'rec_2', 'rec_3'], 'Person_74': ['rec_1'], 'Person_75': ['rec_1', 'rec_2', 'rec_3'], 'Person_78': ['rec_1', 'rec_2'], 'Person_79': ['rec_1', 'rec_2'], 'Person_47': ['rec_1', 'rec_2'], 'Person_46': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_45': ['rec_1', 'rec_2'], 'Person_44': ['rec_1', 'rec_2'], 'Person_43': ['rec_1', 'rec_2'], 'Person_42': ['rec_1', 'rec_2', 'rec_3', 'rec_4'], 'Person_41': ['rec_1', 'rec_2'], 'Person_40': ['rec_1', 'rec_2', 'rec_3', 'rec_4'], 'Person_49': ['rec_1', 'rec_2'], 'Person_48': ['rec_1', 'rec_2'], 'Person_58': ['rec_1', 'rec_2'], 'Person_59': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_50': ['rec_1', 'rec_2'], 'Person_51': ['rec_1', 'rec_2', 'rec_3', 'rec_4'], 'Person_52': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6', 'rec_7', 'rec_8', 'rec_9', 'rec_10', 'rec_11'], 'Person_53': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_54': ['rec_1', 'rec_2'], 'Person_55': ['rec_1', 'rec_2'], 'Person_56': ['rec_1', 'rec_2'], 'Person_57': ['rec_1', 'rec_2', 'rec_3'], 'Person_29': ['rec_1', 'rec_2'], 'Person_28': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_25': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_24': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_27': ['rec_1', 'rec_2', 'rec_3'], 'Person_26': ['rec_1', 'rec_2', 'rec_3', 'rec_4'], 'Person_21': ['rec_1', 'rec_2', 'rec_3'], 'Person_20': ['rec_1', 'rec_2'], 'Person_23': ['rec_1', 'rec_2'], 'Person_22': ['rec_1', 'rec_2'], 'Person_38': ['rec_1', 'rec_2'], 'Person_39': ['rec_1', 'rec_2'], 'Person_36': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_37': ['rec_1', 'rec_2'], 'Person_34': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_35': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_32': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6'], 'Person_33': ['rec_1', 'rec_2'], 'Person_30': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_31': ['rec_1', 'rec_2'], 'Person_03': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_02': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6', 'rec_7', 'rec_8', 'rec_9', 'rec_10', 'rec_11', 'rec_12', 'rec_13', 'rec_14', 'rec_15', 'rec_16', 'rec_17', 'rec_18', 'rec_19', 'rec_20', 'rec_21', 'rec_22'], 'Person_01': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6', 'rec_7', 'rec_8', 'rec_9', 'rec_10', 'rec_11', 'rec_12', 'rec_13', 'rec_14', 'rec_15', 'rec_16', 'rec_17', 'rec_18', 'rec_19', 'rec_20'], 'Person_07': ['rec_1', 'rec_2'], 'Person_06': ['rec_1', 'rec_2'], 'Person_05': ['rec_1', 'rec_2'], 'Person_04': ['rec_1', 'rec_2'], 'Person_09': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6', 'rec_7'], 'Person_08': ['rec_1', 'rec_2'], 'Person_89': ['rec_1', 'rec_2'], 'Person_88': ['rec_1', 'rec_2', 'rec_3'], 'Person_83': ['rec_1', 'rec_2'], 'Person_82': ['rec_1', 'rec_2'], 'Person_81': ['rec_1', 'rec_2'], 'Person_80': ['rec_1', 'rec_2'], 'Person_87': ['rec_1', 'rec_2'], 'Person_86': ['rec_1', 'rec_2'], 'Person_85': ['rec_1', 'rec_2', 'rec_3'], 'Person_84': ['rec_1', 'rec_2']}

In [3]:
# Generate a temporary file named Person##
# where each line is a rec entry
for k,v in rec_dict.iteritems():
    p = open('people','a')
    f = open(k,'w')
    for i in range(len(v)):

Use a %%bash cell to download Physionet files.

Delete the first 2 lines (headers) from each file.

In [ ]:
for i in $(cat people); do
    for ii in $(cat $i); do
        rdsamp -r ecgiddb/$i/$ii -H -f 0 -t 20 -v -ps -s 'ECG I filtered' >$i$ii.txt
        vi -c ':1,2d' -c ':wq' $i$ii.txt;

In [6]:
# Remove temporary files
import os
for k in rec_dict.iterkeys():