Version: Beta 2.0
Created by Abigail Dobyns and Ryan Thorpe
BASS: Biomedical Analysis Software Suite for event detection and signal processing.
Copyright (C) 2015 Abigail Dobyns
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Run the following code block to initialize the program.
Run this block only once.
In [1]:
from BASS import *
For help, check out the wiki: Protocol
Or the video tutorial: Coming Soon!
Use the following block to create a BASS_Dataset object and initialize your settings. All settings are attributes of the dataset instance. Manual initialization of settings in this block is optional and is required only once for a given batch. All BASS_Dataset objects that are initialized are automatically added to the batch.
class BASS_Dataset(inputDir, fileName, outputDir, fileType='plain', timeScale='seconds')
Attributes:
Batch: static list
Contains all instances of the BASS_Dataset object in order to be referenced by the global runBatch function.
Data: dictionary
instance data
Settings: dictionary
instance settings
Results: dictionary
instance results
Methods:
run_analysis(analysis_mod, settings=self.Settings, batch=True): BASS_Dataset method
Highest level of the object-oriented analysis pipeline. First syncs the settings of all BASS_Dataset objects
(stored in Batch), then runs the specified analysis module on each one.
run_analysis must be called after the object is initialized and its Settings added, if the Settings are to be added manually (rather than via the interactive check and load settings function). Analysis runs according to the batch-oriented protocol and is specific to the analysis module named by the "analysis_mod" parameter.
Run BASS_Dataset.run_analysis(analysis_mod, settings, batch)
Runs in either single (batch=False) or batch mode. In batch mode, this function first syncs the settings of each dataset within BASS_Dataset.Batch to the entered "settings" parameter, then runs analysis on each instance within Batch. Be sure to select the correct module for your desired type of analysis. The current options (as of 9/21/16) are "ekg" and "pleth". Parameters are as follows:
Parameters:
analysis_mod: string
the name of the BASS_Dataset module which will be used to analyze the batch of datasets
settings: string or dictionary
can be entered as the location of a settings file or the actual settings dictionary (default = self.Settings)
batch: boolean
determines if the analysis is performed on only the self-instance or as a batch on all object instances
(default=True)
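For example, a minimal batch-mode call might look like the following sketch (assuming datasets like data1 below have already been constructed; settings sync from the calling instance by default):
In [ ]:
#Batch mode sketch: the caller's Settings sync to every dataset in BASS_Dataset.Batch
data1.run_analysis('pleth', batch=True)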
For more information about other settings, see the wiki: https://github.com/drcgw/SWAN/wiki/Tutorial
In [2]:
#Import and Initialize
#Class BASS_Dataset(inputDir, fileName, outputDir, fileType='plain', timeScale='seconds')
data1 = BASS_Dataset('C:\\Users\\Ryan\\Desktop\\sample_data\\','2016-06-22_C1a_P3_Base.txt','C:\\Users\\Ryan\\Desktop\\bass_output\\pleth')
data2 = BASS_Dataset('C:\\Users\\Ryan\\Desktop\\sample_data\\','2016-06-26_C1a_P7_Base.txt','C:\\Users\\Ryan\\Desktop\\bass_output\\pleth')
#transformation Settings
data1.Settings['Absolute Value'] = False #Must be True if Savitzky-Golay is being used
data1.Settings['Bandpass Highcut'] = 12 #in Hz
data1.Settings['Bandpass Lowcut'] = 1 #in Hz
data1.Settings['Bandpass Polynomial'] = 1 #integer
data1.Settings['Linear Fit'] = False #True to apply a linear fit over the whole time series
data1.Settings['Linear Fit-Rolling R'] = 0.5 #between 0 and 1
data1.Settings['Linear Fit-Rolling Window'] = 1000 #window for rolling mean for fit, unit is index not time
data1.Settings['Relative Baseline'] = 0 #default 0, unless data is normalized, then 1.0. Can be any float
data1.Settings['Savitzky-Golay Polynomial'] = 'none' #integer
data1.Settings['Savitzky-Golay Window Size'] = 'none' #must be odd. units are index not time
#Baseline Settings
data1.Settings['Baseline Type'] = 'rolling' #'linear', 'rolling', or 'static'
#For Linear
data1.Settings['Baseline Start'] = None #start time in seconds
data1.Settings['Baseline Stop'] = None #end time in seconds
#For Rolling
data1.Settings['Rolling Baseline Window'] = 5 # in seconds. leave as 'none' if linear or static
#Peaks
data1.Settings['Delta'] = 0.05
data1.Settings['Peak Minimum'] = -0.50 #amplitude value
data1.Settings['Peak Maximum'] = 0.50 #amplitude value
#Bursts
data1.Settings['Apnea Factor'] = 2 #factor to define apneas as a function of expiration
data1.Settings['Burst Area'] = True #calculate burst area
data1.Settings['Exclude Edges'] = True #False to keep edges, True to discard them
data1.Settings['Inter-event interval minimum (time-scale units)'] = 0.0001 #only for bursts, not for peaks
data1.Settings['Maximum Burst Duration (time-scale units)'] = 6
data1.Settings['Minimum Burst Duration (time-scale units)'] = 0.0001
data1.Settings['Minimum Peak Number'] = 1 #minimum number of peaks/burst, integer
data1.Settings['Threshold'] = 0.0001 #linear: proportion of the baseline.
#static: literal value.
#rolling: linear amount greater than the rolling baseline at each time point.
#Outputs
data1.Settings['Generate Graphs'] = False #create and save the fancy graph outputs
#Settings that you should not change unless you are a super advanced user:
#These are settings that are still in development
data1.Settings['Graph LCpro events'] = False
############################################################################################
data1.run_analysis('pleth', batch=False)
In [243]:
display_settings(Settings)
Out[243]:
In [3]:
#grouped summary for peaks
Results['Peaks-Master'].groupby(level=0).describe()
Out[3]:
In [58]:
#grouped summary for bursts
Results['Bursts-Master'].groupby(level=0).describe()
Out[58]:
In [3]:
#Interactive, single time series by Key
key = Settings['Label']
graph_ts(Data, Settings, Results, key)
In [5]:
key = Settings['Label']
start = 550 #start time in seconds
end = 560 #end time in seconds
results_timeseries_plot(key, start, end, Data, Settings, Results)
Display the autocorrelation plot of your transformed data.
Choose the start and end time in seconds. To capture the whole time series, use end = -1. This may be slow.
key = 'Mean1'
start = 0
end = 10
In [ ]:
#autocorrelation
key = Settings['Label']
start = 0 #seconds, where you want the slice to begin
end = 1 #seconds, where you want the slice to end.
autocorrelation_plot(Data['trans'][key][start:end])
plt.show()
Shows the temporal relationship of peaks in each column. Auto-scales. Display only. Intended for more than one column of data.
In [ ]:
#raster
raster(Data, Results)
In [5]:
event_type = 'Peaks'
meas = 'Intervals'
key = Settings['Label']
frequency_plot(event_type, meas, key, Data, Settings, Results)
In [ ]:
#Get average plots, display only
event_type = 'Peaks'
meas = 'Intervals'
average_measurement_plot(event_type, meas, Results)
In [81]:
#Batch
event_type = 'Bursts'
meas = 'Total Cycle Time'
Results = poincare_batch(event_type, meas, Data, Settings, Results)
pd.concat({'SD1':Results['Poincare SD1'],'SD2':Results['Poincare SD2']})
Out[81]:
In [6]:
#quick
event_type = 'Bursts'
meas = 'Attack'
key = Settings['Label']
poincare_plot(Results[event_type][key][meas])
The following blocks allow you to assess the power of event measurements in the frequency domain. While you can call this on any event measurement, it is intended to be used on interval data (or at least data with units in seconds). Recommended:
event_type = 'Bursts'
meas = 'Total Cycle Time'
key = 'Mean1'
scale = 'raw'
event_type = 'Peaks'
meas = 'Intervals'
key = 'Mean1'
scale = 'raw'
Because this analysis is performed in the frequency domain, the unevenly spaced event data must first be interpolated onto an even grid so that an FFT can be performed. Does not support 'all'.
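The sketch below illustrates that interpolation step (made-up event times; assumes scipy is available; this is not the exact psd_event internals): the unevenly spaced measurement is resampled onto an even grid at the 'Hz' setting before the PSD is taken.
In [ ]:
#Sketch: resample unevenly spaced interval data onto an even grid for a PSD
import numpy as np
from scipy import signal
event_times = np.array([0.0, 0.9, 2.1, 2.9, 4.2, 5.0]) #made-up event times (s)
intervals = np.diff(event_times)                        #unevenly spaced measurement
hz = 100.0                                              #Settings['PSD-Event']['Hz']
even_t = np.arange(event_times[1], event_times[-1], 1.0/hz)
even_x = np.interp(even_t, event_times[1:], intervals)  #linear interpolation
freqs, psd = signal.welch(even_x, fs=hz)                #PSD of the resampled series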
Use the code block below to specify your settings for the event-measurement PSD.
In [18]:
Settings['PSD-Event'] = Series(index = ['Hz','ULF', 'VLF', 'LF','HF','dx'])
#Set PSD ranges for power in band
Settings['PSD-Event']['Hz'] = 100 #frequency that the interpolation and PSD are performed with.
Settings['PSD-Event']['ULF'] = 1 #max of the range of the ultra low freq band. range is 0:ulf
Settings['PSD-Event']['VLF'] = 2 #max of the range of the very low freq band. range is ulf:vlf
Settings['PSD-Event']['LF'] = 5 #max of the range of the low freq band. range is vlf:lf
Settings['PSD-Event']['HF'] = 50 #max of the range of the high freq band. range is lf:hf. hf can be no more than (hz/2)
Settings['PSD-Event']['dx'] = 10 #segmentation for the area under the curve.
In [19]:
event_type = 'Peaks'
meas = 'Intervals'
key = Settings['Label']
scale = 'raw'
Results = psd_event(event_type, meas, key, scale, Data, Settings, Results)
Results['PSD-Event'][key]
Out[19]:
Use the settings code block to set your frequency bands to calculate area under the curve. This block is not required. Band output is always in raw power, even if the graph scale is dB/Hz.
In [ ]:
#optional
Settings['PSD-Signal'] = Series(index = ['ULF', 'VLF', 'LF','HF','dx'])
#Set PSD ranges for power in band
Settings['PSD-Signal']['ULF'] = 25 #max of the range of the ultra low freq band. range is 0:ulf
Settings['PSD-Signal']['VLF'] = 75 #max of the range of the very low freq band. range is ulf:vlf
Settings['PSD-Signal']['LF'] = 150 #max of the range of the low freq band. range is vlf:lf
Settings['PSD-Signal']['HF'] = 300 #max of the range of the high freq band. range is lf:hf. hf can be no more than (hz/2) where hz is the sampling frequency
Settings['PSD-Signal']['dx'] = 2 #segmentation for integration of the area under the curve.
Use the block below to generate the PSD graph and power-in-bands results (if selected). scale toggles which units to use for the graph:
raw = s^2/Hz
db = dB/Hz = 10*log10(s^2/Hz)
Graph and table are automatically saved in the PSD-Signal subfolder.
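For instance (made-up numbers, not BASS output), the dB scale is just a log transform of the raw PSD:
In [ ]:
import numpy as np
psd_raw = np.array([1.0, 0.5, 0.01]) #made-up values in s^2/Hz
psd_db = 10 * np.log10(psd_raw)      #same values in dB/Hz: [0., -3.01, -20.]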
In [ ]:
scale = 'raw' #raw or db
Results = psd_signal(version = 'original', key = 'Mean1', scale = scale,
Data = Data, Settings = Settings, Results = Results)
Results['PSD-Signal']
Use the block below to get the spectrogram of the signal. The frequency (y-axis) scales automatically to only show 'active' frequencies. This can take some time to run.
version = 'original'
key = 'Mean1'
After transformation is run, you can call version = 'trans'. This graph is not automatically saved.
In [ ]:
version = 'original'
key = Settings['Label']
spectogram(version, key, Data, Settings, Results)
Generates the moving mean, standard deviation, and count for a given measurement across all columns of the Data, in the form of a DataFrame (displayed as a table). Automatically saves the dataframes of these three results as .csv files with the window size in the name. If meas == 'All', then the function will loop and produce these tables for all measurements.
event_type = 'Peaks'
meas = 'all'
window = 30
In [93]:
#Moving Stats
event_type = 'Bursts'
meas = 'Total Cycle Time'
window = 30 #seconds
Results = moving_statistics(event_type, meas, window, Data, Settings, Results)
Calculates the histogram entropy of a measurement for each column of data. Also saves the histogram of each. If meas is set to 'all', then all available measurements from the chosen event_type will be calculated iteratively.
If all of the samples fall into one bin, regardless of bin size, we have the most predictable situation and the entropy is 0. If the samples follow a uniform distribution, the entropy reaches its maximum of 1.
event_type = 'Bursts'
meas = 'all'
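To make the 0-to-1 scale concrete, here is a minimal normalized histogram-entropy sketch (hist_entropy is a hypothetical helper written for illustration, not the BASS implementation):
In [ ]:
import numpy as np
def hist_entropy(x, bins=10):
    counts, _ = np.histogram(x, bins=bins)
    p = counts / float(counts.sum())
    p = p[p > 0]                                  #empty bins contribute zero
    return -np.sum(p * np.log(p)) / np.log(bins)  #normalized: uniform -> 1
print(hist_entropy(np.ones(100)))                   #all in one bin -> 0.0
print(hist_entropy(np.random.uniform(size=100000))) #uniform -> ~1.0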
In [82]:
#Histogram Entropy
event_type = 'Bursts'
meas = 'all'
Results = histent_wrapper(event_type, meas, Data, Settings, Results)
Results['Histogram Entropy']
Out[82]:
This only runs if you have pyeeg.py in the same folder as this notebook and bass.py. WARNING: THIS FUNCTION RUNS SLOWLY.
Run the code below to get the approximate entropy of any measurement or raw signal. It returns the entropy of the entire results array (no windowing). The following M and R values are used:
M = 2
R = 0.2*std(measurement)
These values can be modified in the source code. Alternatively, you can call ap_entropy directly. Supports 'all'.
Interpretation: A time series containing many repetitive patterns has a relatively small ApEn; a less predictable process has a higher ApEn.
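As a quick illustration of that interpretation (made-up signals; assumes pyeeg.py is importable as above):
In [ ]:
#A repetitive signal scores a lower ApEn than noise
import numpy as np
from pyeeg import ap_entropy
t = np.linspace(0, 10, 200)
sine = np.sin(2 * np.pi * t)    #repetitive -> small ApEn
noise = np.random.randn(200)    #unpredictable -> larger ApEn
print(ap_entropy(sine.tolist(), 2, 0.2 * np.std(sine)))
print(ap_entropy(noise.tolist(), 2, 0.2 * np.std(noise)))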
In [ ]:
#Approximate Entropy
event_type = 'Peaks'
meas = 'Intervals'
Results = ap_entropy_wrapper(event_type, meas, Data, Settings, Results)
Results['Approximate Entropy']
In [ ]:
#Approximate Entropy on raw signal
#takes a VERY long time
from pyeeg import ap_entropy
version = 'original' #original, trans, shift, or rolling
key = Settings['Label'] #Mean1 default key for one time series
start = 0 #seconds, where you want the slice to begin
end = 1 #seconds, where you want the slice to end. The absolute end is -1
ap_entropy(Data[version][key][start:end].tolist(), 2, (0.2*np.std(Data[version][key][start:end])))
This only runs if you have pyeeg.py in the same folder as this notebook and bass.py. WARNING: THIS FUNCTION RUNS SLOWLY.
Run the code below to get the sample entropy of any measurement. It returns the entropy of the entire results array (no windowing). The following M and R values are used:
M = 2
R = 0.2*std(measurement)
These values can be modified in the source code. Alternatively, you can call samp_entropy directly. Supports 'all'.
In [73]:
#Sample Entropy
event_type = 'Bursts'
meas = 'Total Cycle Time'
Results = samp_entropy_wrapper(event_type, meas, Data, Settings, Results)
Results['Sample Entropy']
Out[73]:
In [74]:
Results['Sample Entropy']['Attack']
Out[74]:
In [ ]:
#on raw signal
#takes a VERY long time
from pyeeg import samp_entropy
version = 'original' #original, trans, shift, or rolling
key = Settings['Label']
start = 0 #seconds, where you want the slice to begin
end = 1 #seconds, where you want the slice to end. The absolute end is -1
samp_entropy(Data[version][key][start:end].tolist(), 2, (0.2*np.std(Data[version][key][start:end])))
While not completely up to date with some of the new changes, the Wiki can be useful if you have questions about some of the settings: https://github.com/drcgw/SWAN/wiki/Tutorial
Stuck on a particular step or function? Try typing the function name followed by two question marks (??). This will pop up the docstring and source code. You can also call help() to have the notebook print the docstring.
Example:
analyze??
help(analyze)
In [ ]:
help(moving_statistics)
In [ ]:
moving_statistics??