This data is compiled to digitally observe the Sauk-Suiattle Watershed, powered by HydroShare.
Use this Jupyter Notebook to:
Migrate data sets from prior data download events,
Compute daily, monthly, and annual temperature and precipitation statistics,
Visualize precipitation results relative to the forcing data,
Visualize the time-series trends among the gridded cells using different gridded data products.
To run this notebook, we must import several libraries. These are listed in order of 1) Python standard libraries, 2) the hs_utils library, which provides functions for interacting with HydroShare, including resource querying, downloading, and creation, and 3) the observatory_gridded_hydromet library that is downloaded with this notebook.
In [ ]:
#conda install -c conda-forge basemap-data-hires --yes
In [ ]:
# data processing
import os
import pandas as pd, numpy as np, dask, json
import ogh
import geopandas as gpd

# data migration library
from utilities import hydroshare

# plotting libraries
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

# spatial plotting
import fiona
import shapely.ops
from shapely.geometry import MultiPolygon, shape, point, box, Polygon
from mpl_toolkits.basemap import Basemap

# silence warnings
import warnings
warnings.filterwarnings("ignore")
In [ ]:
# initialize ogh_meta
meta_file = dict(ogh.ogh_meta())
sorted(meta_file.keys())
In [ ]:
sorted(meta_file['dailymet_livneh2013'].keys())
Establish a secure connection with HydroShare by instantiating the hydroshare class that is defined within hs_utils. In addition to connecting with HydroShare, this command also sets and prints environment variables for several parameters that will be useful for saving work back to HydroShare.
In [ ]:
notebookdir = os.getcwd()
hs=hydroshare.hydroshare()
homedir = hs.getContentPath(os.environ["HS_RES_ID"])
os.chdir(homedir)
print('Data will be loaded from and saved to: ' + homedir)
If you are curious about where the data is downloaded, click on the Jupyter Notebook dashboard icon to return to the file system view. The homedir directory printed above is where you can find the data and contents you download to a HydroShare JupyterHub server. At the end of this work session, you can migrate this data to the HydroShare iRODS server as a Generic Resource.
This example uses a shapefile with the watershed boundary of the Sauk-Suiattle Basin, which is stored in HydroShare at the following url: https://www.hydroshare.org/resource/c532e0578e974201a0bc40a37ef2d284/.
The data for our processing routines can be retrieved using the getResourceFromHydroShare function by passing in the global identifier from the url above. In the next cell, we download this resource from HydroShare, and identify that the points in this resource are available for downloading gridded hydrometeorology data, based on the point shapefile at https://www.hydroshare.org/resource/ef2d82bf960144b4bfb1bae6242bcc7f/, which covers the extent of North America and includes the average elevation for each 1/16 degree grid cell. The file must include columns with station numbers, latitude, longitude, and elevation; the headers of these columns must be FID, LAT, LONG_, and ELEV or RASTERVALU, respectively. The station numbers are used for the remainder of the notebook to uniquely reference data from each climate station, as well as to identify the minimum, maximum, and average elevation of all of the climate stations. The web service is currently set to a URL for the smallest overlapping geographic extent - e.g., WRF for the Columbia River Basin (to use a different extent, or data from an FTP service, treatgeoself() would need to be edited in the observatory_gridded_hydrometeorology utility).
In [ ]:
"""
Sauk
"""
# Watershed extent
hs.getResourceFromHydroShare('c532e0578e974201a0bc40a37ef2d284')
sauk = hs.content['wbdhub12_17110006_WGS84_Basin.shp']
In [ ]:
# locate the mapping files from use case 1
mappingfile1 = os.path.join(homedir,'Sauk_mappingfile.csv')
mappingfile2 = os.path.join(homedir,'Elwha_mappingfile.csv')
mappingfile3 = os.path.join(homedir,'RioSalado_mappingfile.csv')
t1 = ogh.mappingfileSummary(listofmappingfiles = [mappingfile1, mappingfile2, mappingfile3],
listofwatershednames = ['Sauk-Suiattle river','Elwha river','Upper Rio Salado'],
meta_file=meta_file)
t1
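Per the column requirements described above, the mapping file can also be spot-checked directly with pandas. The cell below is a minimal sketch (not part of the ogh workflow) and assumes the Sauk mapping file from the prior use case is present in homedir; it confirms that the required column headers exist and summarizes the gridded-cell elevations.
In [ ]:
# minimal sketch: verify the mapping file columns and summarize gridded-cell elevations (illustration only)
mapdf_check = pd.read_csv(mappingfile1)
required = {'FID', 'LAT', 'LONG_', 'ELEV'}  # the elevation column may instead be named RASTERVALU
print('missing columns:', (required - set(mapdf_check.columns)) or 'none')
print('elevation (m): min={0:.0f}, mean={1:.1f}, max={2:.0f}'.format(
    mapdf_check['ELEV'].min(), mapdf_check['ELEV'].mean(), mapdf_check['ELEV'].max()))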
This section performs computations and generates plots of the Livneh 2013, Livneh 2016, and WRF 2014 temperature and precipitation data in order to compare them with each other and with observations. The generated plots are displayed inline in the notebook and saved as .png files in the user's home directory.
In [ ]:
# Livneh et al., 2013
dr1 = meta_file['dailymet_livneh2013']
# Salathe et al., 2014
dr2 = meta_file['dailywrf_salathe2014']
# define overlapping time window
dr = ogh.overlappingDates(date_set1=tuple([dr1['start_date'], dr1['end_date']]),
date_set2=tuple([dr2['start_date'], dr2['end_date']]))
dr
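Conceptually, the overlapping window runs from the later of the two start dates through the earlier of the two end dates. The standalone sketch below illustrates the idea with hypothetical date ranges; it is not the ogh implementation.
In [ ]:
# illustration only: the overlap of two date ranges is [max(starts), min(ends)]
from datetime import date
set1 = (date(1915, 1, 1), date(2011, 12, 31))  # hypothetical range for dataset 1
set2 = (date(1950, 1, 1), date(2010, 12, 31))  # hypothetical range for dataset 2
overlap = (max(set1[0], set2[0]), min(set1[1], set2[1]))
print(overlap)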
Data frames for each set of data are stored in a dictionary. The inputs to gridclim_dict() include the folder location and name of the hydrometeorology data, the file start and end dates, the analysis start and end dates, and the elevation band to be included in the analysis (minimum and maximum elevation).
In [ ]:
%%time
ltm_3bands = ogh.gridclim_dict(mappingfile=mappingfile1,
metadata=meta_file,
dataset='dailymet_livneh2013',
file_start_date=dr1['start_date'],
file_end_date=dr1['end_date'],
file_time_step=dr1['temporal_resolution'],
subset_start_date=dr[0],
subset_end_date=dr[1])
In [ ]:
%%time
ltm_3bands = ogh.gridclim_dict(mappingfile=mappingfile1,
metadata=meta_file,
dataset='dailyvic_livneh2013',
file_start_date=dr1['start_date'],
file_end_date=dr1['end_date'],
file_time_step=dr1['temporal_resolution'],
subset_start_date=dr[0],
subset_end_date=dr[1],
df_dict=ltm_3bands)
In [ ]:
sorted(ltm_3bands.keys())
In [ ]:
meta = meta_file['dailyvic_livneh2013']['variable_info']
meta_df = pd.DataFrame.from_dict(meta).T
meta_df.loc[['BASEFLOW','RUNOFF'],:]
In [ ]:
# total streamflow is the sum of baseflow and surface runoff
ltm_3bands['STREAMFLOW_dailyvic_livneh2013'] = (ltm_3bands['BASEFLOW_dailyvic_livneh2013'] +
                                                ltm_3bands['RUNOFF_dailyvic_livneh2013'])
ltm_3bands['STREAMFLOW_dailyvic_livneh2013']
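The sum above assumes that the baseflow and runoff dataframes share the same date index and gridded-cell columns. The check below is a minimal sketch to confirm that alignment; it is not part of the ogh workflow.
In [ ]:
# minimal sketch: confirm BASEFLOW and RUNOFF align on dates and gridded cells (illustration only)
bf = ltm_3bands['BASEFLOW_dailyvic_livneh2013']
ro = ltm_3bands['RUNOFF_dailyvic_livneh2013']
print('same shape:', bf.shape == ro.shape)
print('same dates:', bf.index.equals(ro.index))
print('same cells:', bf.columns.equals(ro.columns))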
In [ ]:
"""
Sauk-Suiattle
"""
# Watershed extent
hs.getResourceFromHydroShare('c532e0578e974201a0bc40a37ef2d284')
sauk = hs.content['wbdhub12_17110006_WGS84_Basin.shp']
"""
Elwha
"""
# Watershed extent
hs.getResourceFromHydroShare('4aff8b10bc424250b3d7bac2188391e8')
elwha = hs.content["elwha_ws_bnd_wgs84.shp"]
"""
Upper Rio Salado
"""
# Watershed extent
hs.getResourceFromHydroShare('5c041d95ceb64dce8eb85d2a7db88ed7')
riosalado = hs.content['UpperRioSalado_delineatedBoundary.shp']
In [ ]:
# generate surface area for each gridded cell
def computegcSurfaceArea(shapefile, spatial_resolution, vardf):
    """
    Data-driven computation of gridded-cell surface area using the list of gridded-cell centroids

    shapefile: (str) path to the study site shapefile used to select the UTM boundary
    spatial_resolution: (float) the spatial resolution in the degree coordinate reference system, e.g., 1/16
    vardf: (dataframe) input dataframe that contains FID, LAT, and LONG_ references for each gridded-cell centroid

    return: (mean surface area in square meters, standard deviation of surface area)
    """
    # ensure projection into WGS84 longlat values
    ogh.reprojShapefile(shapefile)

    # generate the figure axis
    fig = plt.figure(figsize=(2, 2), dpi=500)
    ax1 = plt.subplot2grid((1, 1), (0, 0))

    # calculate the bounding box based on the watershed shapefile
    watershed = gpd.read_file(shapefile)
    watershed['watershed'] = 'watershed'
    watershed = watershed.dissolve(by='watershed')

    # extract the area centroid, bounding box info, and dimension shape
    lon0, lat0 = np.array(watershed.centroid.iloc[0])
    minx, miny, maxx, maxy = watershed.bounds.iloc[0]

    # generate a transverse Mercator projection
    m = Basemap(projection='tmerc', resolution='h', ax=ax1, lat_0=lat0, lon_0=lon0,
                llcrnrlon=minx, llcrnrlat=miny, urcrnrlon=maxx, urcrnrlat=maxy)

    # generate gridded-cell bounding boxes around each centroid
    midpt_dist = spatial_resolution/2
    cat = vardf.T.reset_index(level=[1, 2]).rename(columns={'level_1': 'LAT', 'level_2': 'LONG_'})
    geometry = cat.apply(lambda x:
                         shapely.ops.transform(m, box(x['LONG_']-midpt_dist, x['LAT']-midpt_dist,
                                                      x['LONG_']+midpt_dist, x['LAT']+midpt_dist)), axis=1)

    # compute each gridded-cell area in square meters
    gc_area = geometry.apply(lambda x: x.area)
    plt.gcf().clear()
    return (gc_area.mean(), gc_area.std())
In [ ]:
gcSA = computegcSurfaceArea(shapefile=sauk, spatial_resolution=1/16, vardf=ltm_3bands['STREAMFLOW_dailyvic_livneh2013'])
gcSA
In [ ]:
# convert mm/s streamflow to volumetric cfs, then compute monthly 10% exceedance values
df_dict = ltm_3bands
objname = 'STREAMFLOW_dailyvic_livneh2013'
dataset = objname.split('_', 1)[1]
gridcell_area = gcSA[0]  # mean gridded-cell surface area (m^2) from computegcSurfaceArea
exceedance = 10

# convert mm/s to m/s
mmps = df_dict[objname]
mps = mmps*0.001

# multiply streamflow (m/s) by the gridded-cell surface area (m^2) to produce volumetric streamflow (m^3/s)
cms = mps.multiply(gridcell_area)

# convert m^3/s to cfs; multiply by (3.28084)^3
cfs = cms.multiply((3.28084)**3)

# output to df_dict
df_dict['cfs_'+objname] = cfs

# time-group to month-yearly streamflow volumetric values
monthly_cfs = cfs.groupby(pd.TimeGrouper('M')).sum()
monthly_cfs.index = pd.Series(monthly_cfs.index).apply(lambda x: x.strftime('%Y-%m'))

# output to df_dict
df_dict['monthly_cfs_'+objname] = monthly_cfs

# prepare for exceedance computations
row_indices = pd.Series(monthly_cfs.index).map(lambda x: pd.datetime.strptime(x, '%Y-%m').month)
months = range(1, 13)
Exceed = pd.DataFrame()

# for each month
for eachmonth in months:
    month_index = row_indices[row_indices == eachmonth].index
    month_res = monthly_cfs.iloc[month_index, :].reset_index(drop=True)

    # generate gridded-cell-specific 10% exceedance probability values (the 90th percentile)
    exceed = pd.DataFrame(month_res.apply(lambda x: np.percentile(x, 100 - exceedance), axis=0)).T

    # append to the dataframe
    Exceed = pd.concat([Exceed, exceed])

# set the index to month order
Exceed = Exceed.set_index(np.array(months))

# output to df_dict
df_dict['EXCEED{0}_{1}'.format(exceedance, dataset)] = Exceed
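For context, the 10% exceedance value is the flow that is equaled or exceeded in 10% of the months, which corresponds to the 90th percentile of the monthly series. The toy example below is an illustration only and is not part of the workflow.
In [ ]:
# illustration only: a 10% exceedance value equals the 90th percentile of the series
toy = pd.Series([5, 12, 18, 25, 40, 63, 80, 95, 120, 150])
print(np.percentile(toy, 100 - 10))  # value exceeded in roughly 10% of the months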
In [ ]:
def monthlyExceedence_cfs(df_dict,
                          daily_streamflow_dfname,
                          gridcell_area,
                          exceedance=10,
                          start_date=None,
                          end_date=None):
    """
    df_dict: (dict) dictionary of dataframes generated by gridclim_dict()
    daily_streamflow_dfname: (str) key of the daily streamflow dataframe in mm/s (row_index: date, col_index: gridded cell ID)
    gridcell_area: (float) gridded-cell surface area in square meters
    exceedance: (int) exceedance probability as a percentage, e.g., 10
    start_date: (datetime) analysis start date; defaults to the start of the record
    end_date: (datetime) analysis end date; defaults to the end of the record
    """
    dataset = daily_streamflow_dfname.split('_', 1)[1]

    # subset the daily streamflow dataframe to the analysis period
    mmps = df_dict[daily_streamflow_dfname]
    if not isinstance(start_date, type(None)):
        mmps = mmps.loc[start_date:, :]
    if not isinstance(end_date, type(None)):
        mmps = mmps.loc[:end_date, :]

    # convert mm/s to m/s
    mps = mmps*0.001

    # multiply streamflow (m/s) by the gridded-cell surface area (m^2) to produce volumetric streamflow (m^3/s)
    cms = mps.multiply(gridcell_area)

    # convert m^3/s to cfs; multiply by (3.28084)^3
    cfs = cms.multiply((3.28084)**3)
    df_dict['cfs_'+daily_streamflow_dfname] = cfs

    # time-group to month-yearly streamflow volumetric values
    monthly_cfs = cfs.groupby(pd.TimeGrouper('M')).sum()
    monthly_cfs.index = pd.Series(monthly_cfs.index).apply(lambda x: x.strftime('%Y-%m'))
    df_dict['monthly_cfs_'+daily_streamflow_dfname] = monthly_cfs

    # compute the gridded-cell-specific exceedance values for each calendar month
    row_indices = pd.Series(monthly_cfs.index).map(lambda x: pd.datetime.strptime(x, '%Y-%m').month)
    months = range(1, 13)
    Exceed = pd.DataFrame()
    for eachmonth in months:
        month_index = row_indices[row_indices == eachmonth].index
        month_res = monthly_cfs.iloc[month_index, :].reset_index(drop=True)
        exceed = pd.DataFrame(month_res.apply(lambda x: np.percentile(x, 100 - exceedance), axis=0)).T
        Exceed = pd.concat([Exceed, exceed])
    Exceed = Exceed.set_index(np.array(months))
    df_dict['EXCEED{0}_{1}'.format(exceedance, dataset)] = Exceed
    return df_dict

# apply the function to the Livneh 2013 VIC streamflow using the mean gridded-cell surface area
ltm_3bands = monthlyExceedence_cfs(df_dict=ltm_3bands,
                                   daily_streamflow_dfname='STREAMFLOW_dailyvic_livneh2013',
                                   gridcell_area=gcSA[0],
                                   exceedance=10)

# For reference, the MATLAB routine below computes the same monthly exceedance quantiles:
# function [Qex] = monthlyExceedence_cfs(file,startyear,endyear)
# % Load data from specified file
# data = load(file);
# Y=data(:,1);
# MO=data(:,2);
# D=data(:,3);
# t = datenum(Y,MO,D);
# %%%
# % startyear=data(1,1);
# % endyear=data(length(data),1);
# Qnode = data(:,4);
# % Time control indices that cover selected period
# d_start = datenum(startyear,10,01,23,0,0); % 1 hour early to catch record for that day
# d_end = datenum(endyear,09,30,24,0,0);
# idx = find(t>=d_start & t<=d_end);
# exds=(1:19)./20; % Exceedence probabilities from 0.05 to 0.95
# mos=[10,11,12,1:9];
# Qex(1:19,1:12)=0 ; % initialize array
# for imo=1:12;
# mo=mos(imo);
# [Y, M, D, H, MI, S] = datevec(t(idx));
# ind=find(M==mo); % find all flow values in that month in the period identified
# Q1=Qnode(idx(ind)); % input is in cfs
# for iex=1:19
# Qex(iex,imo)=quantile(Q1,1-exds(iex));
# end
# end
# end
In [ ]:
ltm_3bands['cfs_STREAMFLOW_dailyvic_livneh2013']
In [ ]:
ltm_3bands['monthly_cfs_STREAMFLOW_dailyvic_livneh2013']
In [ ]:
ltm_3bands['EXCEED10_dailyvic_livneh2013']
In [ ]:
# loop through each month to map the gridded 10% exceedance probability values
for eachmonth in range(1,13):
monthlabel = pd.datetime.strptime(str(eachmonth), '%m')
ogh.renderValuesInPoints(vardf=ltm_3bands['EXCEED10_dailyvic_livneh2013'],
vardf_dateindex=eachmonth,
shapefile=sauk.replace('.shp','_2.shp'),
outfilepath='sauk{0}exceed10.png'.format(monthlabel.strftime('%b')),
plottitle='Sauk {0} 10% Exceedance Probability'.format(monthlabel.strftime('%B')),
colorbar_label='cubic feet per second',
cmap='seismic_r')
In [ ]:
%%time
month=3
monthlabel = pd.datetime.strptime(str(month), '%m')
ogh.renderValuesInPoints(vardf=ltm_3bands['month_PRECIP_dailymet_livneh2013'],
vardf_dateindex=month,
shapefile=sauk.replace('.shp','_2.shp'),
outfilepath=os.path.join(homedir, 'SaukPrecip{0}.png'.format(monthlabel.strftime('%b'))),
plottitle='Sauk-Suiattle watershed'+'\nPrecipitation in '+ monthlabel.strftime('%B'),
colorbar_label='Average monthly precipitation (meters)',
spatial_resolution=1/16, margin=0.5, epsg=3857,
basemap_image='Demographics/USA_Social_Vulnerability_Index',
cmap='gray_r')
month=6
monthlabel = pd.datetime.strptime(str(month), '%m')
ogh.renderValuesInPoints(vardf=ltm_3bands['month_PRECIP_dailymet_livneh2013'],
vardf_dateindex=month,
shapefile=sauk.replace('.shp','_2.shp'),
outfilepath=os.path.join(homedir, 'SaukPrecip{0}.png'.format(monthlabel.strftime('%b'))),
plottitle='Sauk-Suiattle watershed'+'\nPrecipitation in '+ monthlabel.strftime('%B'),
colorbar_label='Average monthly precipitation (meters)',
spatial_resolution=1/16, margin=0.5, epsg=3857,
basemap_image='ESRI_StreetMap_World_2D',
cmap='gray_r')
month=9
monthlabel = pd.datetime.strptime(str(month), '%m')
ogh.renderValuesInPoints(vardf=ltm_3bands['month_PRECIP_dailymet_livneh2013'],
vardf_dateindex=month,
shapefile=sauk.replace('.shp','_2.shp'),
outfilepath=os.path.join(homedir, 'SaukPrecip{0}.png'.format(monthlabel.strftime('%b'))),
plottitle='Sauk-Suiattle watershed'+'\nPrecipitation in '+ monthlabel.strftime('%B'),
colorbar_label='Average monthly precipitation (meters)',
spatial_resolution=1/16, margin=0.5, epsg=3857,
basemap_image='ESRI_Imagery_World_2D',
cmap='gray_r')
month=12
monthlabel = pd.datetime.strptime(str(month), '%m')
ogh.renderValuesInPoints(vardf=ltm_3bands['month_PRECIP_dailymet_livneh2013'],
vardf_dateindex=month,
shapefile=sauk.replace('.shp','_2.shp'),
outfilepath=os.path.join(homedir, 'SaukPrecip{0}.png'.format(monthlabel.strftime('%b'))),
plottitle='Sauk-Suiattle watershed'+'\nPrecipitation in '+ monthlabel.strftime('%B'),
colorbar_label='Average monthly precipitation (meters)',
spatial_resolution=1/16, margin=0.5, epsg=3857,
basemap_image='Elevation/World_Hillshade',
cmap='seismic_r')
In [ ]:
for month in [3, 6, 9, 12]:
monthlabel = pd.datetime.strptime(str(month), '%m')
outfile='SaukLivnehPrecip{0}.png'.format(monthlabel.strftime('%b'))
ax1 = ogh.renderValuesInPoints(vardf=ltm_3bands['month_PRECIP_dailymet_livneh2013'],
vardf_dateindex=month,
shapefile=sauk.replace('.shp','_2.shp'),
basemap_image='ESRI_Imagery_World_2D',
cmap='seismic_r',
plottitle='Sauk-Suiattle watershed'+'\nPrecipitation in '+monthlabel.strftime('%B'),
colorbar_label='Average monthly precipitation (meters)',
outfilepath=os.path.join(homedir, outfile))
In [ ]:
%%time
ltm_3bands = ogh.gridclim_dict(mappingfile=mappingfile1,
metadata=meta_file,
dataset='dailywrf_salathe2014',
colvar=None,
file_start_date=dr2['start_date'],
file_end_date=dr2['end_date'],
file_time_step=dr2['temporal_resolution'],
subset_start_date=dr[0],
subset_end_date=dr[1],
df_dict=ltm_3bands)
In [ ]:
for month in [3, 6, 9, 12]:
monthlabel = pd.datetime.strptime(str(month), '%m')
outfile='SaukSalathePrecip{0}.png'.format(monthlabel.strftime('%b'))
ax1 = ogh.renderValuesInPoints(vardf=ltm_3bands['month_PRECIP_dailywrf_salathe2014'],
vardf_dateindex=month,
shapefile=sauk.replace('.shp','_2.shp'),
basemap_image='ESRI_Imagery_World_2D',
cmap='seismic_r',
plottitle='Sauk-Suiattle watershed'+'\nPrecipitation in '+monthlabel.strftime('%B'),
colorbar_label='Average monthly precipitation (meters)',
outfilepath=os.path.join(homedir, outfile))
In [ ]:
def plot_meanTmin(dictionary, loc_name, start_date, end_date):
# Plot 1: Monthly temperature analysis of Livneh data
    # if neither the Livneh nor the WRF minimum-temperature summaries are available, skip the plot
    if ('meanmonth_temp_min_liv2013_met_daily' not in dictionary.keys()) and \
       ('meanmonth_temp_min_wrf2014_met_daily' not in dictionary.keys()):
        return

    # generate month indices
    wy_index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    wy_numbers = [10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    month_strings = ['Oct', 'Nov', 'Dec', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep']

    # initiate the plot object
    fig, ax = plt.subplots(1, 1, figsize=(10, 6))

    if 'meanmonth_temp_min_liv2013_met_daily' in dictionary.keys():
        # Liv2013
        plt.plot(wy_index, dictionary['meanmonth_temp_min_liv2013_met_daily'][wy_numbers], 'r-', linewidth=1, label='Liv Temp min')

    if 'meanmonth_temp_min_wrf2014_met_daily' in dictionary.keys():
        # WRF2014
        plt.plot(wy_index, dictionary['meanmonth_temp_min_wrf2014_met_daily'][wy_numbers], 'b-', linewidth=1, label='WRF Temp min')

    if 'meanmonth_temp_min_livneh2013_wrf2014bc_met_daily' in dictionary.keys():
        # bias-corrected WRF2014
        plt.plot(wy_index, dictionary['meanmonth_temp_min_livneh2013_wrf2014bc_met_daily'][wy_numbers], 'g-', linewidth=1, label='WRFbc Temp min')
# add reference line at y=0
plt.plot([1, 12],[0, 0], 'k-',linewidth=1)
plt.ylabel('Temp (C)',fontsize=14)
plt.xlabel('Month',fontsize=14)
plt.xlim(1,12);
plt.xticks(wy_index, month_strings);
plt.tick_params(labelsize=12)
plt.legend(loc='best')
plt.grid(which='both')
plt.title(str(loc_name)+'\nMinimum Temperature\n Years: '+str(start_date.year)+'-'+str(end_date.year)+'; Elevation: '+str(dictionary['analysis_elev_min'])+'-'+str(dictionary['analysis_elev_max'])+'m', fontsize=16)
plt.savefig('monthly_Tmin'+str(loc_name)+'.png')
plt.show()
If you want to plot observed SNOTEL point precipitation or temperature with the gridded climate data, set the option to 'Y'. Give the name of the SNOTEL file and the name to be used in the figure legends. File format: Daily SNOTEL Data Report - Historic - By individual SNOTEL site, standard sensors (https://www.wcc.nrcs.usda.gov/snow/snotel-data.html).
In [ ]:
# Sauk
SNOTEL_file = os.path.join(homedir,'ThunderBasinSNOTEL.txt')
SNOTEL_station_name='Thunder Creek'
SNOTEL_file_use_colsnames = ['Date','Air Temperature Maximum (degF)', 'Air Temperature Minimum (degF)','Air Temperature Average (degF)','Precipitation Increment (in)']
SNOTEL_station_elev=int(4320/3.281) # meters
SNOTEL_obs_daily = ogh.read_daily_snotel(file_name=SNOTEL_file,
usecols=SNOTEL_file_use_colsnames,
delimiter=',',
header=58)
# generate the start and stop date
SNOTEL_obs_start_date=SNOTEL_obs_daily.index[0]
SNOTEL_obs_end_date=SNOTEL_obs_daily.index[-1]
# peek
SNOTEL_obs_daily.head(5)
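The SNOTEL report uses degrees Fahrenheit and inches, while the gridded products are in degrees Celsius and millimeters. The conversion below is a minimal sketch for later comparisons and assumes that read_daily_snotel preserves the column names listed above; it is not part of the ogh workflow.
In [ ]:
# minimal sketch: convert SNOTEL observations to metric units (assumes the column names above are preserved)
SNOTEL_obs_metric = pd.DataFrame(index=SNOTEL_obs_daily.index)
for col in ['Air Temperature Maximum (degF)', 'Air Temperature Minimum (degF)', 'Air Temperature Average (degF)']:
    SNOTEL_obs_metric[col.replace('degF', 'degC')] = (SNOTEL_obs_daily[col] - 32.0) * 5.0/9.0
SNOTEL_obs_metric['Precipitation Increment (mm)'] = SNOTEL_obs_daily['Precipitation Increment (in)'] * 25.4
SNOTEL_obs_metric.head(5)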
In [ ]:
COOP_file=os.path.join(homedir, 'USC00455678.csv') # Sauk
COOP_station_name='Mt Vernon'
COOP_file_use_colsnames = ['DATE','PRCP','TMAX', 'TMIN','TOBS']
COOP_station_elev=int(4.3) # meters
COOP_obs_daily = ogh.read_daily_coop(file_name=COOP_file,
usecols=COOP_file_use_colsnames,
delimiter=',',
header=0)
# generate the start and stop date
COOP_obs_start_date=COOP_obs_daily.index[0]
COOP_obs_end_date=COOP_obs_daily.index[-1]
# peek
COOP_obs_daily.head(5)
In [ ]:
#initiate new dictionary with original data
ltm_0to3000 = ogh.gridclim_dict(metadata=meta_file,
mappingfile=mappingfile1,
dataset='dailymet_livneh2013',
file_start_date=dr1['start_date'],
file_end_date=dr1['end_date'],
subset_start_date=dr[0],
subset_end_date=dr[1])
ltm_0to3000 = ogh.gridclim_dict(metadata=meta_file,
mappingfile=mappingfile1,
dataset='dailywrf_salathe2014',
file_start_date=dr2['start_date'],
file_end_date=dr2['end_date'],
subset_start_date=dr[0],
subset_end_date=dr[1],
df_dict=ltm_0to3000)
sorted(ltm_0to3000.keys())
In [ ]:
# read in the mappingfile
mappingfile = mappingfile1
mapdf = pd.read_csv(mappingfile)
# select station by first FID
firstStation = ogh.findStationCode(mappingfile=mappingfile, colvar='FID', colvalue=0)
# select station by elevation
maxElevStation = ogh.findStationCode(mappingfile=mappingfile, colvar='ELEV', colvalue=mapdf.loc[:,'ELEV'].max())
medElevStation = ogh.findStationCode(mappingfile=mappingfile, colvar='ELEV', colvalue=mapdf.loc[:,'ELEV'].median())
minElevStation = ogh.findStationCode(mappingfile=mappingfile, colvar='ELEV', colvalue=mapdf.loc[:,'ELEV'].min())
# print(firstStation, mapdf.iloc[0].ELEV)
# print(maxElevStation, mapdf.loc[:,'ELEV'].max())
# print(medElevStation, mapdf.loc[:,'ELEV'].median())
# print(minElevStation, mapdf.loc[:,'ELEV'].min())
# let's compare monthly averages for TMAX using livneh, salathe, and the salathe-corrected livneh
comp = ['month_TMAX_dailymet_livneh2013',
'month_TMAX_dailywrf_salathe2014']
obj = dict()
for eachkey in ltm_0to3000.keys():
if eachkey in comp:
obj[eachkey] = ltm_0to3000[eachkey]
panel_obj = pd.Panel.from_dict(obj)
panel_obj
In [ ]:
comp = ['meanmonth_TMAX_dailymet_livneh2013',
'meanmonth_TMAX_dailywrf_salathe2014']
obj = dict()
for eachkey in ltm_0to3000.keys():
if eachkey in comp:
obj[eachkey] = ltm_0to3000[eachkey]
df_obj = pd.DataFrame.from_dict(obj)
df_obj
In [ ]:
# pick one summary key and parse its naming convention: <temporal resolution>_<variable>_<dataset>_<publication>
each = comp[0]
t_res, var, dataset, pub = each.rsplit('_', 3)
print(t_res, var, dataset, pub)
In [ ]:
ylab_var = meta_file['_'.join([dataset, pub])]['variable_info'][var]['desc']
ylab_unit = meta_file['_'.join([dataset, pub])]['variable_info'][var]['units']
print('{0} {1} ({2})'.format(t_res, ylab_var, ylab_unit))
In [ ]:
%%time
comp = [['meanmonth_TMAX_dailymet_livneh2013','meanmonth_TMAX_dailywrf_salathe2014'],
['meanmonth_PRECIP_dailymet_livneh2013','meanmonth_PRECIP_dailywrf_salathe2014']]
wy_numbers=[10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9]
month_strings=[ 'Oct', 'Nov', 'Dec', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep']
fig = plt.figure(figsize=(20,5), dpi=500)
ax1 = plt.subplot2grid((2, 2), (0, 0), colspan=1)
ax2 = plt.subplot2grid((2, 2), (1, 0), colspan=1)
# monthly
for eachsumm in df_obj.columns:
ax1.plot(df_obj[eachsumm])
ax1.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), fancybox=True, shadow=True, ncol=2, fontsize=10)
plt.show()
In [ ]:
df_obj[each].index.map(lambda x: x + 2)
In [ ]:
# compare monthly TMAX at the lowest-elevation gridded cell
comp = ['month_TMAX_dailymet_livneh2013', 'month_TMAX_dailywrf_salathe2014']
fig, ax = plt.subplots()
lws = [3, 10, 3, 3]
styles = ['b--', 'go-', 'y--', 'ro-']
for col, style, lw in zip(comp, styles, lws):
    panel_obj.xs(key=(minElevStation[0][0], minElevStation[0][1], minElevStation[0][2]),
                 axis=2)[col].plot(style=style, lw=lw, ax=ax, legend=True)
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), fancybox=True, shadow=True, ncol=2)
fig.show()
fig, ax = plt.subplots()
lws=[3, 10, 3, 3]
styles=['b--','go-','y--','ro-']
for col, style, lw in zip(comp, styles, lws):
panel_obj.xs(key=(maxElevStation[0][0], maxElevStation[0][1], maxElevStation[0][2]),
axis=2)[col].plot(style=style, lw=lw, ax=ax, legend=True)
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), fancybox=True, shadow=True, ncol=2)
fig.show()
In [ ]:
vic_dr1 = meta_file['dailyvic_livneh2013']
vic_dr2 = meta_file['dailyvic_livneh2015']

# define the overlapping time window for the two VIC hydrology products
vic_dr = ogh.overlappingDates(tuple([vic_dr1['start_date'], vic_dr1['end_date']]),
                              tuple([vic_dr2['start_date'], vic_dr2['end_date']]))

vic_ltm_3bands = ogh.gridclim_dict(mappingfile=mappingfile,
                                   metadata=meta_file,
                                   dataset='dailyvic_livneh2013',
                                   file_start_date=vic_dr1['start_date'],
                                   file_end_date=vic_dr1['end_date'],
                                   file_time_step=vic_dr1['temporal_resolution'],
                                   subset_start_date=vic_dr[0],
                                   subset_end_date=vic_dr[1])
In [ ]:
sorted(vic_ltm_3bands.keys())
Using the hs_utils library, the results of the geoprocessing steps above can be saved back into HydroShare. First, define all of the required metadata for resource creation, i.e., title, abstract, keywords, and content files. In addition, we must define the type of resource that will be created, in this case genericresource.
Note: Make sure you save the notebook at this point, so that all notebook changes will be saved into the new HydroShare resource.
In [ ]:
#execute this cell to list the content of the directory
!ls -lt
Create list of files to save to HydroShare. Verify location and names.
In [ ]:
climate2013_tar = 'livneh2013.tar.gz'
climate2015_tar = 'livneh2015.tar.gz'
wrf_tar = 'salathe2014.tar.gz'

!tar -zcf {climate2013_tar} livneh2013
!tar -zcf {climate2015_tar} livneh2015
!tar -zcf {wrf_tar} salathe2014
In [ ]:
ThisNotebook = 'Observatory_Sauk_TreatGeoSelf.ipynb'  # check the name for consistency
mappingfile = 'Sauk_mappingfile.csv'
files = [ThisNotebook, mappingfile, climate2013_tar, climate2015_tar, wrf_tar]
In [ ]:
# for each file downloaded onto the server folder, move to a new HydroShare Generic Resource
title = 'Results from testing out the TreatGeoSelf utility'
abstract = 'This is the output from the TreatGeoSelf utility integration notebook.'
keywords = ['Sauk', 'climate', 'Landlab','hydromet','watershed']
rtype = 'genericresource'
# create the new resource
resource_id = hs.createHydroShareResource(abstract,
title,
keywords=keywords,
resource_type=rtype,
content_files=files,
public=False)