pyphysio tutorial

3. Pipelines

In this tutorial we consider two pipelines for the processing of ECG and EDA signals respectively.

We divide the pipelines into three separate steps:

  1. Filtering and Preprocessing: this step includes all the procedures aiming at increasing the signal/noise ratio, typycally band-pass filtering, smoothing, removal of artifacts. The output of this step is a new version of the imput signal with improved signal quality (less noise);
  2. Information Extraction: this step aims at extracting the information of interest from the physiological signal. The output is a new signal containing only the information of interest and thus it has a signal_nature different from the input signal.
  3. Physiological Indicators: this steps produces a list of scalar values able to describe the characteristics of the input signal. This step is usually performed on small segments of the input signals which are extracted using a sliding window on the whole length of the signal.

Figure 1: Representation of the three main steps of a signal processing pipeline. Below: example of the ideal results of each block on a Photo-PlethysmoGraph (PPG) signal.

To understand how a signal processing step is represented in pyphysio see tutorial 2-algorithms

In the following code we will also use the shortened sintax to apply a signal processing step:

# standard sintax: creation + execution
filter_iir = ph.IIRFilter(fp=45, fs = 50, ftype='ellip') # creation of the processing step
ecg_out = filter_iir(ecg) # execution on the input signal


# shortened sintax: creation(execution)
ecg_out = ph.IIRFilter(fp=45, fs = 50, ftype='ellip')(ecg)

However, the standard sintax is suggested when applying the same processing steps to multiple signals or for the clarity of the scripts:

Es:

filter_iir = ph.IIRFilter(fp=45, fs = 50, ftype='ellip') # creation of the processing step

ecg_1_out = filter_iir(ecg_1) # execution on the input signal
ecg_2_out = filter_iir(ecg_2) # execution on the input signal
ecg_3_out = filter_iir(ecg_3) # execution on the input signal

3.1 ECG processing pipeline

Step 0: Import data


In [1]:
# import libraries
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

In [2]:
# import all pyphysio classes and methods
import pyphysio as ph


Please cite:
Bizzego et al. (2019) 'pyphysio: A physiological signal processing library for data science approaches in physiology', SoftwareX

In [3]:
# import data and creating a signal
import pyphysio.tests as test
ecg_data = test.TestData.ecg()

fsamp = 2048
ecg = ph.EvenlySignal(values = ecg_data, sampling_freq = fsamp, signal_type = 'ecg')

In [4]:
ecg.plot()


Out[4]:
[<matplotlib.lines.Line2D at 0x7feacac24588>]

To better understand how to develop a signal processing pipelines we imagine that the physiological signals were collected during an experiment in which the subject watched two images with different emotional content.

Specifically the experiment is composed of four parts:

  • [ 0s - 30s] - baseline
  • [ 30s - 60s] - image 1
  • [ 60s - 90s] - baseline
  • [ 90s - 120s] - image 2

We store the information about the experimental sessions in an EvenlySignal signal appositely created:


In [5]:
# create label
label = np.zeros(1200)
label[300:600] = 1
label[900:1200] = 2

label = ph.EvenlySignal(label, sampling_freq = 10, signal_type = 'label')

Note that in some other cases a similar signal might be provided by design of the experiment, for instance by using markers or triggers.

Step 1: Filtering and preprocessing


In [6]:
# (optional) IIR filtering : remove high frequency noise
ecg = ph.IIRFilter(fp=45, fs = 50, ftype='ellip')(ecg)

In [7]:
# normalization : normalize data
ecg = ph.Normalize(norm_method='standard')(ecg)

In [8]:
# resampling : increase the sampling frequency by cubic interpolation
ecg = ecg.resample(fout=4096, kind='cubic')
fsamp = 4096

In [9]:
ecg.plot()


Out[9]:
[<matplotlib.lines.Line2D at 0x7feac850b438>]

Step 2: Information Extraction

The information we want to extract from the ECG signal is the position of the heartbeats and the Inter Beat Interval signal.


In [10]:
ibi = ph.BeatFromECG()(ecg)

In [11]:
ibi.get_duration()


Out[11]:
120.0

In [12]:
# check results so far
ax1 = plt.subplot(211)
ecg.plot()
plt.vlines(ibi.get_times(), np.min(ecg), np.max(ecg))

plt.subplot(212, sharex = ax1)
ibi.plot('o-')
plt.vlines(ibi.get_times(), np.min(ibi), np.max(ibi))
plt.show()



In [13]:
# edit IBI
# ibi_ok = ph.Annotate(ecg, ibi)()

Step 3: Physiological Indicators


In [14]:
# check label
ax1 = plt.subplot(211)
ibi.plot('.-')

plt.subplot(212, sharex = ax1)
label.plot('.-')
plt.show()



In [15]:
# define a list of indicators we want to compute
hrv_indicators = [ph.Mean(name='RRmean'), ph.StDev(name='RRstd'), ph.RMSSD(name='rmsSD')]

In [16]:
#fixed length windowing
fixed_length = ph.FixedSegments(step = 5, width = 10, labels = label)

indicators, col_names = ph.fmap(fixed_length, hrv_indicators, ibi)

In [17]:
# extract column with the labels for each window
label_w = indicators[:, np.where(col_names == 'label')[0]]

# extract column with the RRmean values computed from each window
rrmean_w = indicators[:, np.where(col_names == 'RRmean')[0]]

rrmean_image1 = rrmean_w[np.where(label_w==1)[0]].ravel()
rrmean_image2 = rrmean_w[np.where(label_w==2)[0]].ravel()

In [18]:
## create a box and whisker plot to compare the distibution of the RRmean indicator
plt.boxplot([rrmean_image1, rrmean_image2],
            labels=['image1', 'image2'])


Out[18]:
{'whiskers': [<matplotlib.lines.Line2D at 0x7feace97a8d0>,
  <matplotlib.lines.Line2D at 0x7feace97ac50>,
  <matplotlib.lines.Line2D at 0x7feace912080>,
  <matplotlib.lines.Line2D at 0x7feace9123c8>],
 'caps': [<matplotlib.lines.Line2D at 0x7feace97af98>,
  <matplotlib.lines.Line2D at 0x7feace986320>,
  <matplotlib.lines.Line2D at 0x7feace912710>,
  <matplotlib.lines.Line2D at 0x7feace912a58>],
 'boxes': [<matplotlib.lines.Line2D at 0x7feace97a550>,
  <matplotlib.lines.Line2D at 0x7feace986cc0>],
 'medians': [<matplotlib.lines.Line2D at 0x7feace986668>,
  <matplotlib.lines.Line2D at 0x7feace912da0>],
 'fliers': [<matplotlib.lines.Line2D at 0x7feace9869b0>,
  <matplotlib.lines.Line2D at 0x7feace91e128>],
 'means': []}

pyphysio provides by default some presets of standard indicators for Heart Rate Variability and Electrodermal activity analysis.

Each indicator has its own pre-defined parameters:


In [19]:
HRV_FD = ph.preset_hrv_fd() #frequency domain HRV indicators
print(HRV_FD)

print(HRV_FD[0].get())


[IBI_VLF_Pow, IBI_LF_Pow, IBI_HF_Pow, IBI_Total_Pow]
{'freq_min': 1e-05, 'freq_max': 0.04, 'method': 'ar', 'interp_freq': 4, 'name': 'IBI_VLF_Pow'}

If a customization of the indicators is not required, it is easier to directly use them instead of manually define each indicator:


In [20]:
FD_HRV_ind, col_names = ph.fmap(fixed_length, ph.preset_hrv_fd(), ibi.resample(4))

If you need to export the results, for instance in a .csv datafile, you can use pandas:


In [21]:
import pandas as pd

# create a pandas dataframe
FD_HRV_df = pd.DataFrame(FD_HRV_ind, columns=col_names)

FD_HRV_df


Out[21]:
begin end label IBI_VLF_Pow IBI_LF_Pow IBI_HF_Pow IBI_Total_Pow
0 0.850098 10.850098 0.0 59.516601 203.751677 391.404224 1309.709483
1 5.850098 15.850098 0.0 264.439856 843.781813 394.526680 2082.863133
2 10.850098 20.850098 0.0 132.541890 349.282597 590.365659 1687.926380
3 15.850098 25.850098 0.0 89.239632 290.723129 409.507018 1423.139258
4 30.850098 40.850098 1.0 100.549819 316.695904 514.396912 1570.979730
5 35.850098 45.850098 1.0 230.816440 593.825096 750.872661 2177.812320
6 40.850098 50.850098 1.0 76.334252 266.728378 662.535907 1788.202972
7 45.850098 55.850098 1.0 173.964370 653.045174 523.428901 1961.157784
8 60.850098 70.850098 0.0 183.219677 593.111776 390.872019 1769.266383
9 65.850098 75.850098 0.0 275.869859 1173.359213 740.084483 2736.346689
10 70.850098 80.850098 0.0 64.312217 286.784686 795.383133 1771.060922
11 75.850098 85.850098 0.0 222.644422 726.048345 893.153690 2401.287011
12 90.850098 100.850098 2.0 256.708599 814.222175 741.991231 2365.930261
13 95.850098 105.850098 2.0 99.584297 534.540706 720.614651 1942.897032
14 100.850098 110.850098 2.0 239.481419 989.035020 606.820660 2410.078054
15 105.850098 115.850098 2.0 125.228110 636.315143 626.826865 1963.559251

In [22]:
# and save it to a .csv file:

#FD_HRV_df.to_csv('filename.csv')

In [23]:
# BE CAREFUL when executing the above command (uncomment first) 
# as it will save a new file in your current working directory.
# To check your current working directory:
import os
print(os.getcwd())


/media/veracrypt1/UniTn/software/pyphysio/tutorials

3.2 EDA processing pipeline

Step 0: Import data


In [24]:
# import data and creating a signal

eda_data = test.TestData.eda()

fsamp = 2048
eda = ph.EvenlySignal(values = eda_data, sampling_freq = fsamp, signal_type = 'eda')

eda.plot()


Out[24]:
[<matplotlib.lines.Line2D at 0x7feacc166c18>]

Step 1: Filtering and preprocessing


In [25]:
# resampling : decrease the sampling frequency by cubic interpolation
eda = eda.resample(fout=8, kind='cubic')

In [26]:
# IIR filtering : remove high frequency noise
eda_filt = ph.IIRFilter(fp=0.8, fs = 1.1, ftype='ellip')(eda)

In [27]:
eda.plot()
eda_filt.plot()
eda = eda_filt


Step 2: Information Extraction

The information we want to extract from the EDA signal is the phasic component associated to the sympathetic activity.


In [28]:
# estimate the driver function
driver = ph.DriverEstim()(eda)

In [29]:
eda.plot()
driver.plot()


Out[29]:
[<matplotlib.lines.Line2D at 0x7feace9a5c88>]

In [30]:
# compute the phasic component
phasic, tonic, _ = ph.PhasicEstim(delta=0.02)(driver)
eda.plot()
driver.plot()
phasic.plot()


Out[30]:
[<matplotlib.lines.Line2D at 0x7feacc0a0a58>]

In [31]:
# check results so far
plt.figure()
ax1 = plt.subplot(211)
eda.plot()
label.plot()

plt.subplot(212, sharex = ax1)
driver.plot()
phasic.plot()
plt.grid()
plt.show()


Step 3: Physiological Indicators


In [32]:
#fixed length windowing
fixed_length = ph.FixedSegments(step = 5, width = 20, labels = label)

# we use the preset indicators for the phasic signal.
# We need to define the minimum amplitude of the peaks that will be considered
PHA_ind, col_names = ph.fmap(fixed_length, ph.preset_phasic(delta=0.02), phasic)

In [33]:
## Box-Whisker plot
## extract column with the labels for each window
label_w = PHA_ind[:, np.where(col_names == 'label')[0]]

## extract column with the PksMean values
## computed from each window
pksmean_w = PHA_ind[:, np.where(col_names == 'pha_PeaksMean')[0]]

pksmean_image1 = pksmean_w[np.where(label_w==1)[0]]
pksmean_image2 = pksmean_w[np.where(label_w==2)[0]]

## create a box and whisker plot 
## to compate the distibution of the RRmean indicator
plt.figure()
plt.boxplot([pksmean_image1, pksmean_image2], 
            labels=['image1', 'image2'])
plt.show()



In [ ]: