Overview of the ACT Python Library

The ACT library was developed for use by the Atmospheric Chemistry and Technology Lab at Washington University in St. Louis. The library makes the following easy:

1. Import and export data for various instruments, including the PAM, VAPS, and Thermo Scientific gas analyzers
2. Plot data for the various instruments using the matplotlib library
3. Analyze data in Python using pandas

The entire library is heavily dependent on pandas, which means it is also dependent on numpy, scipy, and matplotlib. The project is open source and licensed under the MIT license. It can be found on GitHub at http://github.com/dhhagan/ACT

1. General Structure of the Library:

/ACT
    /ACT
        /pam
        /thermo
            io.py
            visualize.py
        /vaps

There are three primary subcomponents to the library:

1. pam
   - merge the two data files generated by the PAM into one csv/xlsx file for safekeeping
   - plot the data generated by the PAM for funsies
2. thermo
   - input/output of data files from the Thermo Scientific gas analyzers
   - plot the diurnal profile of trace gases
   - plot trace gas concentrations over time
3. vaps
   - plot important variables necessary to debug the instrument and monitor the set points of the PIDs

2. Installation of the Library

Three simple steps:

1. Download the library from http://github.com/dhhagan/ACT, or just git clone it into the directory of your choosing
2. Open your command prompt and navigate to the directory you just downloaded the library into
3. Run the command: python setup.py install

Tada!
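To confirm that the install step worked before moving on, a quick check like the one below can help. It is a minimal sketch using only the Python 3 standard library; the helper name is_installed is made up for this illustration and is not part of ACT.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the library has been installed into the active environment.
print("ACT installed:", is_installed("ACT"))
```

If this prints False, the most likely cause is that the install command was run with a different Python interpreter than the one you are using now.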

3. Using the ACT Library

3.1 Importing data from the Thermo Gas Analyzers

Now that the library is installed, you should be able to easily import it. After importing, we are going to set runDir to the directory where the thermo analyzer data is held.

In [1]:
import ACT

runDir = r"C:\Users\David\Dropbox\Dhruv and David work\SLAQRS-I Data\Thermo Analyzer Data"
You have many options for importing the data. The first is to import the data for a single instrument from the raw .dat files created by the instrument itself, using the read_thermo_dat function. There are a few arguments that can be sent to this function:

1. model: either 'nox', 'sox', or 'o3' depending on the instrument you want (default is 'nox')
2. runDir: full directory where the data is located (default is the current directory)
3. sample_int: interval of the sampled data (default is 1min)
4. start: date of the first file you want to read
5. end: date of the last file you want to read

Normally, I just define the model and runDir and leave everything else at its default. The obvious exception is when you are only interested in certain dates. The function returns the number of files read and a DataFrame containing the data.

3.1.1 Import Data from .dat Files


In [2]:
# Read in data for the o3 analyzer for all dates:
%time files, o3 = ACT.read_thermo_dat('o3', runDir)

o3.info()


Wall time: 5.52 s
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 70910 entries, 2013-08-05 12:01:00 to 2013-09-23 17:50:00
Freq: T
Data columns (total 10 columns):
bncht     31718 non-null float64
cellai    31718 non-null float64
cellbi    31718 non-null float64
flowa     31718 non-null float64
flowb     31718 non-null float64
hio3      31718 non-null float64
lmpt      31718 non-null float64
o3        31718 non-null float64
o3lt      31718 non-null float64
pres      31718 non-null float64
dtypes: float64(10)
Now we have all of the o3 data between August 5 and September 23 at 1 min intervals. If we want to read in all of the nox data for the month of September, we can do the following:

In [3]:
%time files, nox = ACT.read_thermo_dat('nox', runDir, start='9-1-2013', end='9-30-2013')


Wall time: 7.38 s
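The o3.info() output in section 3.1.1 lists 70910 index entries but only 31718 non-null values per column. One plausible explanation (an assumption, not something confirmed from the library source) is that the readings are placed on a regular 1-minute grid, so minutes with no recorded data show up as NaN. The sketch below reproduces that effect with plain pandas and a few made-up readings; it is not the code read_thermo_dat uses.

```python
import pandas as pd

# Hypothetical readings with a gap between 12:02 and 12:06
times = pd.to_datetime(
    ["2013-08-05 12:01", "2013-08-05 12:02", "2013-08-05 12:06"]
)
raw = pd.DataFrame({"o3": [30.1, 30.4, 29.8]}, index=times)

# Put the readings on a regular 1-minute grid; empty minutes become NaN
regular = raw.resample("1min").mean()

print(len(regular))           # number of rows on the regular grid
print(regular["o3"].count())  # number of non-null readings
```

If the gaps get in the way of later analysis, regular.dropna() returns only the minutes that actually have data.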

3.1.2 Import Data from csv/xlsx Files

You can import the data for one or more instruments from either a csv or xlsx file using the functions read_thermo_csv() and read_thermo_xlsx().

1. read_thermo_csv(filename, runDir, sample_int): To import data from a csv, the data should be in a single worksheet and should start in the first cell of the first column, with the first row containing the headers and the first column containing the timestamp index.
2. read_thermo_xlsx(filename, sheetname, runDir, sample_int, skiprows):
   - filename: well, it's just the name of the file
   - sheetname: name of the sheet containing the data
   - runDir: directory containing the file
   - sample_int: sample interval
   - skiprows: number of rows to skip at the beginning of the spreadsheet
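To make the expected csv layout concrete, here is a small sketch that builds a file of that shape in memory and reads it with plain pandas. The column names and values are invented for illustration, and this is not the implementation of read_thermo_csv; it only shows the layout described above (headers in the first row, timestamps in the first column).

```python
import io

import pandas as pd

# Made-up example data: header row first, timestamp in the first column
csv_text = """timestamp,o3,nox
2013-09-23 00:00:00,31.2,12.5
2013-09-23 00:01:00,30.8,12.9
2013-09-23 00:02:00,30.5,13.4
"""

# index_col=0 uses the first column as the index; parse_dates converts it to timestamps
demo = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(demo.index)
print(demo.columns.tolist())
```

A file that does not follow this layout (for example, one with extra title rows above the headers) would need cleaning up before it can be read this way.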

In [4]:
dataCSV = ACT.read_thermo_csv("SLAQRS.csv", runDir)
dataXLSX = ACT.read_thermo_xlsx("SLAQRS.xlsx",runDir=runDir)

3.2 Plot Data from Thermo Analyzers

There are a few plots we use often that I will show here: the debugging charts for the analyzers, the diurnal profile for one instrument or for multiple instruments, and gas concentrations over a period of time.

3.2.1 Debugging Charts for the Thermo Scientific Gas Analyzers

To plot the debugging chart, use the ThermoPlot class in ACT.thermo.visualize. You must have the data contained in the .dat files, so I recommend just reading the data straight from these files. Although you can certainly plot more than one day of data, it takes a while and looks pretty messy. Note that if you want to plot between two days, use the notation o3['day 1':'day 2']. You can also plot between specific times of specific days by including the time in the date/time stamp.

You can send a settings dictionary to the debug_plot() function (through its args keyword) that gives you complete control over the titles, axis labels, colors, and many other fun things. The following parameters are available:

1. instrument
2. title
3. xlabel
4. ylabpressure
5. ylabgas
6. ylabtemp
7. title_fontsize
8. labels_fontsize
9. grid

To see how to set the arguments, look at the third example below.
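The date notation mentioned above is ordinary pandas date-string slicing on a DatetimeIndex. The sketch below shows it on a made-up 1-minute DataFrame (o3_demo is a stand-in, not real analyzer data). It uses .loc, because recent pandas versions no longer accept a single date string in plain brackets on a DataFrame, while .loc works in old and new versions alike.

```python
import numpy as np
import pandas as pd

# Stand-in for analyzer data: three days of values at 1-minute intervals
index = pd.date_range("2013-09-21", "2013-09-23 23:59", freq="min")
o3_demo = pd.DataFrame({"o3": np.arange(len(index), dtype=float)}, index=index)

# One full day
one_day = o3_demo.loc["2013-09-23"]

# Two days; both end points are included
two_days = o3_demo.loc["2013-09-22":"2013-09-23"]

# Specific times on a specific day
morning = o3_demo.loc["2013-09-23 06:00":"2013-09-23 09:00"]

print(len(one_day), len(two_days), len(morning))
```

Any of these slices can be passed on for plotting in the same way as the single-day selections in the examples that follow.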

In [5]:
# Read in data for the o3 analyzer for all dates:
files, o3 = ACT.read_thermo_dat('o3', runDir)

data = ACT.thermo.visualize.ThermoPlot(o3['9-23-2013'])

fig, (ax1, ax2, ax3) = data.debug_plot()



In [6]:
# Read in data for the NOx analyzer for all dates:
%time files, nox = ACT.read_thermo_dat('nox', runDir)

data = ACT.thermo.visualize.ThermoPlot(nox['9-23-2013'])

fig, (a1, a2, a3) = data.debug_plot()


Wall time: 16.1 s

In [7]:
# Fun with arguments
arguments = {
    'title':'So many things to change!',
    'grid':False,
    'xlabel':'This is fun!'
    }

files, sox = ACT.read_thermo_dat('sox', runDir)
data = ACT.thermo.visualize.ThermoPlot(sox['9-23-2013'])

fig, (a1, a2, a3) = data.debug_plot(args=arguments)


3.2.2 Plotting the Diurnal Profile

There are two major options currently supported in this library:

1. Diurnal profile for NOx, SO2, and O3
2. Diurnal profile for one gas

Options/Arguments

1. data : the data (as read from a CSV) containing timestamp, so2, nox, no2, no, o3
2. dates : you can choose one date (dates=['3-1-2014']) or a range (dates=['3-1-2013','3-15-2013']), or leave it blank to plot all dates
3. shaded : True or False
4. title : string containing an alternate title
5. xlabel : string containing an alternate x label

Notes:

1. We really only need the concentrations for these so it may be easier to just read it in from the CSV, but I plan on building
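For readers unfamiliar with the term, a diurnal profile summarizes how a concentration varies over the course of a day, averaged across many days. The sketch below computes one with plain pandas by grouping on the hour of day. It is an illustration on synthetic data, not the code behind diurnal_plot; the helper name diurnal_profile is made up, and treating the standard deviation as the source of a shaded band is an assumption.

```python
import numpy as np
import pandas as pd


def diurnal_profile(series):
    """Return the mean and standard deviation of a series for each hour of day."""
    grouped = series.groupby(series.index.hour)
    return pd.DataFrame({"mean": grouped.mean(), "std": grouped.std()})


# Synthetic 1-minute data: a smooth daily cycle repeated over three days
index = pd.date_range("2013-09-01", periods=3 * 24 * 60, freq="min")
hours = np.asarray(index.hour) + np.asarray(index.minute) / 60.0
values = 30 + 20 * np.sin((hours - 9) / 24 * 2 * np.pi)
o3_demo = pd.Series(values, index=index, name="o3")

profile = diurnal_profile(o3_demo)
print(profile.head())
```

The result has one row per hour (0 through 23), which is the shape a plot of concentration against hour of day needs.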

In [8]:
# Get the data from a CSV
dataCSV = ACT.read_thermo_csv("SLAQRS.csv", runDir)

# Plot with argument shaded=True
f, (a1, a2, a3) = ACT.diurnal_plot(dataCSV,shaded=True)



In [9]:
# Plot a single gas (nox) with shaded=True and a custom color
f, ax = ACT.diurnal_plot_single(dataCSV, model='nox', shaded=True, color1='green')


Features for the Future:

1. Ability to mesh together data from the thermo analyzers without having to go through a CSV/XLSX file
2. Ability to use different column names as defined by the user, not the stupid thermo instrument
3. Plots for VAPS
4. Plots for PAM