CZOData Example 1: Read and plot CZO_DisplayFile_v1 with Pandas. Read from a "CZO Display File v1", convert to a Pandas DataFrame, and plot the time series. Written by Anthony Aufdenkampe, Friday Dec. 13, 2013.
In [4]:
# Import all required Python libraries and modules
import pandas as pd
import matplotlib.pyplot as plt
In [5]:
# Create a list of file paths for the CZO Display Files (v1) to read
# Examples here available at http://czo.stroudcenter.org/data/ or http://criticalzone.org/christina/data/
file_paths = ['/Users/aaufdenkampe/Documents/Python/EnviroDataScripts/CZODisplayParsePlot/ExampleData/CRB_WCC_STAGEFLOW_2011.csv',
'/Users/aaufdenkampe/Documents/Python/EnviroDataScripts/CZODisplayParsePlot/ExampleData/CRB_WCC_STAGEFLOW_2012.csv'
]
In [6]:
# A For loop that reads each file using the Pandas "read_csv" function,
# then appends the resulting DataFrame object to a list called "data_frames".
data_frames = []
for file_path in file_paths:
    df = pd.read_csv(file_path, header=0, skipinitialspace=True, skiprows=[1], index_col=0, na_values=[-9999], parse_dates=True)
    data_frames.append(df)
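To see what each `read_csv` option above does, here is a minimal sketch run against a small in-memory sample. The sample text is an assumption about the file layout (a header row, then a second row of units, with -9999 marking missing values), inferred from the options used above rather than copied from a real CZO Display File. The column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical sample: line 0 is the header, line 1 is a units row,
# and -9999 marks a missing value. Not taken from a real CZO file.
sample = io.StringIO(
    "DateTime, Stage, Flow\n"
    "EST, ft, cfs\n"
    "2011-01-01 00:00, 1.25, 10.5\n"
    "2011-01-01 00:15, -9999, 11.0\n"
)

sample_df = pd.read_csv(
    sample,
    header=0,               # first line holds the column names
    skipinitialspace=True,  # drop the space that follows each comma
    skiprows=[1],           # skip the second line (the units row)
    index_col=0,            # use the first column as the index
    na_values=[-9999],      # treat -9999 as missing
    parse_dates=True,       # parse the index as datetimes
)
print(sample_df)
print(sample_df.index)
```

If the units row were not skipped, its text entries would end up as data and the affected columns would likely be read as strings instead of numbers.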
In [7]:
# Concatenate all the DataFrames in the "data_frames" list into a single DataFrame
df = pd.concat(data_frames)
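`pd.concat` stacks the frames in list order and keeps each frame's own index, so it can be worth checking that the combined index is in time order and free of duplicate timestamps. A minimal sketch with made-up frames (the names `df_2011`, `df_2012`, and `flow` are hypothetical):

```python
import pandas as pd

# Two made-up yearly frames that meet at the year boundary.
df_2011 = pd.DataFrame(
    {"flow": [1.0, 2.0]},
    index=pd.to_datetime(["2011-12-31 23:30", "2011-12-31 23:45"]),
)
df_2012 = pd.DataFrame(
    {"flow": [3.0]},
    index=pd.to_datetime(["2012-01-01 00:00"]),
)

combined = pd.concat([df_2011, df_2012])

# Sanity checks on the combined index.
print(combined.index.is_monotonic_increasing)
print(combined.index.has_duplicates)
```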
In [8]:
df
Out[8]:
In [9]:
df.index
Out[9]:
In [10]:
df.index = df.index.tz_localize('EST')
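`tz_localize` attaches a time zone to naive timestamps without shifting the clock times. A minimal sketch on a made-up index, assuming the 'EST' zone resolves to a fixed UTC-5 offset with no daylight saving adjustment for these dates:

```python
import pandas as pd

# Made-up naive timestamps, one in winter and one in summer.
naive_index = pd.DatetimeIndex(["2011-01-01 12:00", "2011-07-01 12:00"])

# Attach the zone; the wall-clock times stay the same.
est_index = naive_index.tz_localize("EST")

# Converting to UTC shows the offset that was applied.
utc_index = est_index.tz_convert("UTC")

print(est_index)
print(utc_index)
```

Under that assumption, both timestamps shift by the same five hours when converted to UTC, which is what you would want for a record logged in standard time all year.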
In [11]:
df.index
Out[11]:
In [12]:
df.dtypes
Out[12]:
In [13]:
df['Gage Height (ft) from Continuous record'] = pd.to_numeric(df['Gage Height (ft) from Continuous record'], errors='coerce')
df['Discharge (cfs) from Continuous record'] = pd.to_numeric(df['Discharge (cfs) from Continuous record'], errors='coerce')
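With `errors='coerce'`, `pd.to_numeric` replaces any entry it cannot parse with NaN instead of raising an error. A minimal sketch on made-up values, where "Ice" stands in for any text flag that might appear in a data column:

```python
import pandas as pd

# Made-up column mixing numeric strings with a text flag.
raw = pd.Series(["1.25", "Ice", "3"])

# Unparseable entries become NaN; the result has a numeric dtype.
numeric = pd.to_numeric(raw, errors="coerce")

print(numeric)
print(numeric.dtype)
```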
In [14]:
df.dtypes
Out[14]:
In [15]:
df.head(n=5)
Out[15]:
In [16]:
df.columns
Out[16]:
In [17]:
%matplotlib inline
In [24]:
df.plot()
Out[24]:
In [35]:
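# Note: file_path still holds the last path from the read loop above, so the title names only that file.
ax = df['Discharge (cfs) from Continuous record'].plot(title=file_path, style='b', logy=True, ylim=(1,1000), legend=True)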
ax.set_ylabel(u'Discharge (cfs) from Continuous record', color='b')
ax2 = df['Gage Height (ft) from Continuous record'].plot(secondary_y=True, style='g', legend=True)
ax2.set_ylabel(u'Gage Height (ft) from Continuous record', color='g')
Out[35]:
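The two-axis plot above can be tried without the data files. The following is a minimal sketch on synthetic values, assuming matplotlib is installed. The frame `demo` and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute series standing in for discharge and stage.
times = pd.date_range("2011-01-01", periods=48, freq="15min")
demo = pd.DataFrame(
    {
        "discharge_cfs": np.linspace(5, 500, 48),
        "stage_ft": np.linspace(1, 4, 48),
    },
    index=times,
)

# Left axis: discharge on a log scale.
ax = demo["discharge_cfs"].plot(style="b", logy=True, ylim=(1, 1000), legend=True)
ax.set_ylabel("discharge_cfs", color="b")

# Right axis: stage on a linear scale, sharing the same time axis.
ax2 = demo["stage_ft"].plot(secondary_y=True, style="g", legend=True)
ax2.set_ylabel("stage_ft", color="g")
```

Coloring each axis label to match its line makes it easier to tell which scale belongs to which series.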
The data frame needs to be converted to a CSV format that can be read into HydroShare. See https://help.hydroshare.org/hydroshare-resource-types/time-series/understanding-what-file-types-can-be-uploaded-into-a-time-series-resource/
In [68]:
# Copy the two columns so that changes to the export frame do not affect df
df_export = df[['Gage Height (ft) from Continuous record', 'Discharge (cfs) from Continuous record']].copy()
df_export.index.name = 'ValueDateTime'
In [69]:
# Rename only the columns; the datetime index is left as is rather than converted to strings
df_export = df_export.rename(columns={"Discharge (cfs) from Continuous record": "discharge_cfs", "Gage Height (ft) from Continuous record": "stage_ft"})
In [74]:
df_export.head(n=5)
Out[74]:
In [75]:
df_export.to_csv('/Users/aaufdenkampe/Documents/Python/EnviroDataScripts/CZODisplayParsePlot/ExampleData/CRB_WCC_STAGEFLOW_from_df.csv')
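To preview what `to_csv` writes without touching the file system, the frame can be written to an in-memory buffer. A minimal sketch on made-up values, reusing the `ValueDateTime` index name and the short column names from the cells above. The frame `demo_export` is hypothetical.

```python
import io
import pandas as pd

# Made-up two-row frame with a time-zone-aware index.
times = pd.date_range("2011-01-01", periods=2, freq="15min", tz="EST")
demo_export = pd.DataFrame(
    {"stage_ft": [1.25, 1.30], "discharge_cfs": [10.5, 11.0]},
    index=times,
)
demo_export.index.name = "ValueDateTime"

# Write to a string buffer instead of a file and inspect the text.
buffer = io.StringIO()
demo_export.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The index name becomes the first column header, so checking the first line of the output is a quick way to confirm the layout before uploading the file anywhere.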
In [ ]:
### Unfortunately, I can't get the CSV file to automatically parse into a HydroShare Time Series resource!