Notebook to TreatGeoSelf with gridded climate data set coordinates

Case study: the Sauk-Suiattle river watershed, the Elwha river watershed, the Upper Rio Salado watershed

Use this Jupyter Notebook to:

1. HydroShare setup and preparation
2. Get list of gridded cell centroids
3. Get shapefiles that describe the study site polygons
4. Map gridded cell centroids that intersect with the study sites
5. Save results back into HydroShare






This data is compiled to digitally observe the watersheds, powered by HydroShare.
Provided by the Watershed Dynamics Group, Dept. of Civil and Environmental Engineering, University of Washington

1. Setup and prepare HydroShare

First, we import several Python libaries. to 1) map file paths and directory management, 2) apply spatial operations and gridded data product processing, 3) migrate files from hydroshare resources (data sharing environment) in and out of the jupyterhub (computing environment), and 4) enable visualizations inline. Default warning messages are silenced using the warnings library.

Note: In this section, 'homedir' is named to retain the active file directory where printed files are located. 'homedir' is used to formally name the destination directory of the output file. This is seen in generating the mapping files, multi-site figures, and gridded cell gradient figures.

Note: If the python library basemap-data-hires is not installed, please uncomment and run the following code chunk.


In [1]:
# path and directory management
import os

# gridded data processing
import ogh

# data migration library
from utilities import hydroshare

# plotting and shape libraries
%matplotlib inline

# silencing warning
import warnings
warnings.filterwarnings("ignore")

Establish a secure connection with HydroShare by instantiating the hydroshare class that is defined within hs_utils. In addition to connecting with HydroShare, this command also sets and prints environment variables for several parameters that will be useful for saving work back to HydroShare.


In [2]:
hs=hydroshare.hydroshare()
homedir = hs.getContentPath(os.environ["HS_RES_ID"])
os.chdir(homedir)
print('Data will be loaded from and save to:'+homedir)


Adding the following system variables:
   HS_USR_NAME = jphuong
   HS_RES_ID = 87dc5742cf164126a11ff45c3307fd9d
   HS_RES_TYPE = compositeresource
   JUPYTER_HUB_IP = jupyter.cuahsi.org

These can be accessed using the following command: 
   os.environ[key]

   (e.g.)
   os.environ["HS_USR_NAME"]  => jphuong
Successfully established a connection with HydroShare
Data will be loaded from and save to:/home/jovyan/work/notebooks/data/87dc5742cf164126a11ff45c3307fd9d/87dc5742cf164126a11ff45c3307fd9d/data/contents

If you are curious about where the data is being downloaded, click on the Jupyter Notebook dashboard icon to return to the File System view. The homedir directory location printed above is where you can find the data and contents you will download to a HydroShare JupyterHub server. At the end of this work session, you can migrate this data to the HydroShare iRods server as a Generic Resource.

2. Get list of gridded cell centroids

The data for our processing routines can be retrieved using the getResourceFromHydroShare function by passing in the global identifier from the url above. In the next cell, we download this resource from HydroShare, and identify that the points in this resource are available for downloading gridded hydrometeorology data, based on the point shapefile at https://www.hydroshare.org/resource/ef2d82bf960144b4bfb1bae6242bcc7f/, which is for the extent of North America and includes the average elevation for each 1/16 degree grid cell. The file must include columns with station numbers, latitude, longitude, and elevation. The header of these columns must be FID, LAT, LONG_, and ELEV, respectively. The station numbers will be used for the remainder of the code to uniquely reference data from each climate station, as well as to identify minimum, maximum, and average elevation of all of the climate stations. The webserice is currently set to a URL for the smallest geographic overlapping extent - e.g. WRF for Columbia River Basin (to use a limit using data from a FTP service, treatgeoself() would need to be edited in observatory_gridded_hydrometeorology utility).


In [3]:
"""
1/16-degree Gridded cell centroids
"""
# List of available data
hs.getResourceFromHydroShare('ef2d82bf960144b4bfb1bae6242bcc7f')
NAmer = hs.content['NAmer_dem_list.shp']


This resource already exists in your userspace.
ef2d82bf960144b4bfb1bae6242bcc7f/
|-- ef2d82bf960144b4bfb1bae6242bcc7f/
|   |-- bagit.txt
|   |-- manifest-md5.txt
|   |-- readme.txt
|   |-- tagmanifest-md5.txt
|   |-- data/
|   |   |-- resourcemap.xml
|   |   |-- resourcemetadata.xml
|   |   |-- contents/
|   |   |   |-- NAmer_dem_list.cpg
|   |   |   |-- NAmer_dem_list.dbf
|   |   |   |-- NAmer_dem_list.prj
|   |   |   |-- NAmer_dem_list.sbn
|   |   |   |-- NAmer_dem_list.sbx
|   |   |   |-- NAmer_dem_list.shp
|   |   |   |-- NAmer_dem_list.shx

Do you want to overwrite these data [Y/n]? n
Found the following file(s) associated with this HydroShare resource.
NAmer_dem_list.cpg
NAmer_dem_list.dbf
NAmer_dem_list.prj
NAmer_dem_list.sbn
NAmer_dem_list.sbx
NAmer_dem_list.shp
NAmer_dem_list.shx
These files are stored in a dictionary called hs.content for your convenience. To access a file, simply issue the following command where MY_FILE is one of the files listed above:
hs.content["MY_FILE"] 

3. Get shapefiles that describe the study site polygons

This example uses a shapefile with the watershed boundary of the Sauk-Suiattle Basin, which is stored in HydroShare at the following url: https://www.hydroshare.org/resource/c532e0578e974201a0bc40a37ef2d284/. The shapefile with the watershed boundary for Elwha River Basin is stored at https://www.hydroshare.org/resource/4aff8b10bc424250b3d7bac2188391e8/. The shapefile for the Upper Rio Salado River Basin can be obtained from https://www.hydroshare.org/resource/5c041d95ceb64dce8eb85d2a7db88ed7/.


In [4]:
"""
Sauk-Suiattle
"""
# Watershed extent
hs.getResourceFromHydroShare('c532e0578e974201a0bc40a37ef2d284')
sauk = hs.content['wbdhub12_17110006_WGS84_Basin.shp']
ogh.reprojShapefile(sauk)

"""
Elwha
"""
# Watershed extent
hs.getResourceFromHydroShare('4aff8b10bc424250b3d7bac2188391e8', )
elwha = hs.content["elwha_ws_bnd_wgs84.shp"]
ogh.reprojShapefile(elwha)

"""
Upper Rio Salado
"""
# Watershed extent
hs.getResourceFromHydroShare('5c041d95ceb64dce8eb85d2a7db88ed7')
riosalado = hs.content['UpperRioSalado_delineatedBoundary.shp']
ogh.reprojShapefile(riosalado)


This resource already exists in your userspace.
c532e0578e974201a0bc40a37ef2d284/
|-- c532e0578e974201a0bc40a37ef2d284/
|   |-- bagit.txt
|   |-- manifest-md5.txt
|   |-- readme.txt
|   |-- tagmanifest-md5.txt
|   |-- data/
|   |   |-- resourcemap.xml
|   |   |-- resourcemetadata.xml
|   |   |-- contents/
|   |   |   |-- wbdhub12_17110006_WGS84_Basin.cpg
|   |   |   |-- wbdhub12_17110006_WGS84_Basin.shp
|   |   |   |-- wbdhub12_17110006_WGS84_Basin.shx
|   |   |   |-- wbdhub12_17110006_WGS84_Basin.dbf
|   |   |   |-- wbdhub12_17110006_WGS84_Basin.prj

Do you want to overwrite these data [Y/n]? n
Found the following file(s) associated with this HydroShare resource.
wbdhub12_17110006_WGS84_Basin.cpg
wbdhub12_17110006_WGS84_Basin.shp
wbdhub12_17110006_WGS84_Basin.shx
wbdhub12_17110006_WGS84_Basin.dbf
wbdhub12_17110006_WGS84_Basin.prj
These files are stored in a dictionary called hs.content for your convenience. To access a file, simply issue the following command where MY_FILE is one of the files listed above:
hs.content["MY_FILE"] 
This resource already exists in your userspace.
4aff8b10bc424250b3d7bac2188391e8/
|-- 4aff8b10bc424250b3d7bac2188391e8/
|   |-- bagit.txt
|   |-- manifest-md5.txt
|   |-- readme.txt
|   |-- tagmanifest-md5.txt
|   |-- data/
|   |   |-- resourcemap.xml
|   |   |-- resourcemetadata.xml
|   |   |-- contents/
|   |   |   |-- elwha_ws_bnd_wgs84.cpg
|   |   |   |-- elwha_ws_bnd_wgs84.shp
|   |   |   |-- elwha_ws_bnd_wgs84.shx
|   |   |   |-- elwha_ws_bnd_wgs84.dbf
|   |   |   |-- elwha_ws_bnd_wgs84.prj
|   |   |   |-- elwha_ws_bnd_wgs84_2.shp
|   |   |   |-- elwha_ws_bnd_wgs84_2.shx
|   |   |   |-- elwha_ws_bnd_wgs84_2.dbf
|   |   |   |-- elwha_ws_bnd_wgs84_2.cpg
|   |   |   |-- elwha_ws_bnd_wgs84_2.prj

Do you want to overwrite these data [Y/n]? n
Found the following file(s) associated with this HydroShare resource.
elwha_ws_bnd_wgs84.cpg
elwha_ws_bnd_wgs84.shp
elwha_ws_bnd_wgs84.shx
elwha_ws_bnd_wgs84.dbf
elwha_ws_bnd_wgs84.prj
elwha_ws_bnd_wgs84_2.shp
elwha_ws_bnd_wgs84_2.shx
elwha_ws_bnd_wgs84_2.dbf
elwha_ws_bnd_wgs84_2.cpg
elwha_ws_bnd_wgs84_2.prj
These files are stored in a dictionary called hs.content for your convenience. To access a file, simply issue the following command where MY_FILE is one of the files listed above:
hs.content["MY_FILE"] 
This resource already exists in your userspace.
5c041d95ceb64dce8eb85d2a7db88ed7/
|-- 5c041d95ceb64dce8eb85d2a7db88ed7/
|   |-- bagit.txt
|   |-- manifest-md5.txt
|   |-- readme.txt
|   |-- tagmanifest-md5.txt
|   |-- data/
|   |   |-- resourcemap.xml
|   |   |-- resourcemetadata.xml
|   |   |-- contents/
|   |   |   |-- UpperRioSalado_delineatedBoundary.cpg
|   |   |   |-- UpperRioSalado_delineatedBoundary.shp
|   |   |   |-- UpperRioSalado_delineatedBoundary.shx
|   |   |   |-- UpperRioSalado_delineatedBoundary.dbf
|   |   |   |-- UpperRioSalado_delineatedBoundary.prj
|   |   |   |-- UpperRioSalado_delineatedBoundary_2.shp
|   |   |   |-- UpperRioSalado_delineatedBoundary.shp.xml
|   |   |   |-- UpperRioSalado_delineatedBoundary_2.shx
|   |   |   |-- UpperRioSalado_delineatedBoundary_2.dbf
|   |   |   |-- UpperRioSalado_delineatedBoundary_2.cpg
|   |   |   |-- UpperRioSalado_delineatedBoundary_2.prj

Do you want to overwrite these data [Y/n]? n
Found the following file(s) associated with this HydroShare resource.
UpperRioSalado_delineatedBoundary.cpg
UpperRioSalado_delineatedBoundary.shp
UpperRioSalado_delineatedBoundary.shx
UpperRioSalado_delineatedBoundary.dbf
UpperRioSalado_delineatedBoundary.prj
UpperRioSalado_delineatedBoundary_2.shp
UpperRioSalado_delineatedBoundary.shp.xml
UpperRioSalado_delineatedBoundary_2.shx
UpperRioSalado_delineatedBoundary_2.dbf
UpperRioSalado_delineatedBoundary_2.cpg
UpperRioSalado_delineatedBoundary_2.prj
These files are stored in a dictionary called hs.content for your convenience. To access a file, simply issue the following command where MY_FILE is one of the files listed above:
hs.content["MY_FILE"] 

Visualize the Sauk-Suiattle river basin study site


In [5]:
v1 = ogh.multiSiteVisual(listOfShapefiles=[sauk], 
                         listOfNames=['Sauk-Suiattle river'],
                         multishape=os.path.join(homedir,'eachwatershed.shp'), 
                         singleshape=os.path.join(homedir,'allwatersheds.shp'), 
                         fileoutpath=os.path.join(homedir,'sauk_annotated_map.png'),
                         projection='merc', epsg=3857, polygon_color='m', margin=1, 
                         scale_x_dist=0, scale_y_dist=-0.35, scale_ref_length=80, scale_yoffset=5000,
                         text_x_dist=0, text_y_dist=0.35)


Visualize all study sites upon the landscape


In [6]:
annotated_map = os.path.join(homedir, 'all_map_annotated.png')

v2 = ogh.multiSiteVisual(listOfShapefiles=[sauk, elwha, riosalado],
                         listOfNames=['Sauk-Suiattle river','Elwha river','Upper Rio Salado'],
                         multishape=os.path.join(homedir, 'eachwatershed.shp'),
                         singleshape=os.path.join(homedir, 'allwatersheds.shp'),
                         fileoutpath=annotated_map,
                         projection='merc', epsg=3857, polygon_color='m', margin=0.2, 
                         scale_x_dist=-1, scale_y_dist=-12.5, scale_ref_length=500, scale_yoffset=25000,
                         text_x_dist=0, text_y_dist=-2.5)



In [7]:
annotated_map = os.path.join(homedir, 'all_map.png')

v2 = ogh.multiSiteVisual(listOfShapefiles=[sauk, elwha, riosalado],
                         listOfNames=['Sauk-Suiattle river','Elwha river','Upper Rio Salado'],
                         multishape=os.path.join(homedir, 'eachwatershed.shp'),
                         singleshape=os.path.join(homedir, 'allwatersheds.shp'),
                         fileoutpath=annotated_map,
                         projection='merc', epsg=3857, polygon_color='m', margin=0.2, 
                         scale_x_dist=-1, scale_y_dist=-12.5, scale_ref_length=500, scale_yoffset=25000,
                         text_x_dist=0, text_y_dist=-2.5,
                         annotate=False)



In [8]:
annotated_map = os.path.join(homedir, 'all_map_star.png')

v3 = ogh.multiSiteStar(listOfShapefiles=[sauk, elwha, riosalado],
                       listOfNames=['Sauk-Suiattle river','Elwha river','Upper Rio Salado'],
                       multishape=os.path.join(homedir, 'temp_eachwatershed.shp'),
                       singleshape=os.path.join(homedir, 'temp_allwatersheds.shp'),
                       fileoutpath=annotated_map,
                       projection='merc', epsg=3857, polygon_color='m', margin=0.2, 
                       scale_x_dist=-1, scale_y_dist=-12.5, scale_ref_length=500, scale_yoffset=25000,
                       text_x_dist=0, text_y_dist=-2)


4. Map Gridded Cell Centroids that intersect with the watersheds of interest


In [9]:
%%time 
# Generate list of stations to download
mappingfile1=ogh.treatgeoself(shapefile=sauk, NAmer=NAmer, buffer_distance=0.06,
                              mappingfile=os.path.join(homedir,'Sauk_mappingfile.csv'))
print(mappingfile1)


mappingfile2=ogh.treatgeoself(shapefile=elwha, NAmer=NAmer, buffer_distance=0.06, 
                              mappingfile=os.path.join(homedir,'Elwha_mappingfile.csv'))
print(mappingfile2)


mappingfile3=ogh.treatgeoself(shapefile=riosalado, NAmer=NAmer, buffer_distance=0.06, 
                              mappingfile=os.path.join(homedir,'RioSalado_mappingfile.csv'))
print(mappingfile3)


(99, 4)
   FID       LAT      LONG_    ELEV
0    0  48.53125 -121.59375  1113.0
1    1  48.46875 -121.46875   646.0
2    2  48.46875 -121.53125   321.0
3    3  48.46875 -121.59375   164.0
4    4  48.46875 -121.65625   369.0
/home/jovyan/work/notebooks/data/87dc5742cf164126a11ff45c3307fd9d/87dc5742cf164126a11ff45c3307fd9d/data/contents/Sauk_mappingfile.csv
(55, 4)
   FID       LAT      LONG_   ELEV
0    0  48.15625 -123.53125   36.0
1    1  48.15625 -123.59375   58.0
2    2  48.15625 -123.71875   97.0
3    3  48.09375 -123.40625  174.0
4    4  48.09375 -123.46875  229.0
/home/jovyan/work/notebooks/data/87dc5742cf164126a11ff45c3307fd9d/87dc5742cf164126a11ff45c3307fd9d/data/contents/Elwha_mappingfile.csv
(31, 4)
   FID       LAT      LONG_    ELEV
0    0  34.53125 -107.65625  2052.0
1    1  34.53125 -107.71875  2217.0
2    2  34.53125 -107.78125  2231.0
3    3  34.53125 -107.84375  2297.0
4    4  34.53125 -107.90625  2395.0
/home/jovyan/work/notebooks/data/87dc5742cf164126a11ff45c3307fd9d/87dc5742cf164126a11ff45c3307fd9d/data/contents/RioSalado_mappingfile.csv
CPU times: user 1min 7s, sys: 1.23 s, total: 1min 9s
Wall time: 1min 9s

Visualize the gridded cell elevation gradient of each study site


In [10]:
%%time 
# plot the watershed elevation gradient

for mappingfile, shp, sitename, outfilename in zip([mappingfile1, mappingfile2, mappingfile3],
                                                   [sauk, elwha, riosalado],
                                                   ['Sauk-Suiattle river','Elwha river','Upper Rio Salado'],
                                                   ['gcGradient_s.png','gcGradient_e.png','gcGradient_r.png']):

    # generate gridded Cell Gradient for the mappingfile ELEV variable
    ogh.griddedCellGradient(mappingfile=mappingfile, 
                            shapefile=shp,
                            outfilepath=os.path.join(homedir, outfilename),
                            plottitle='{0} watershed\nGridded cell elevation gradient'.format(sitename),
                            colorbar_label='Average elevation (meters)',
                            spatial_resolution=1/16, margin=0.25, epsg=3857, column='ELEV',
                            basemap_image='ESRI_Imagery_World_2D', cmap='terrain')


CPU times: user 50.1 s, sys: 3.91 s, total: 54 s
Wall time: 52.8 s

5. Save results back into HydroShare

Using the hs_utils library, the results of the Geoprocessing steps above can be saved back into HydroShare. First, define all of the required metadata for resource creation, i.e. title, abstract, keywords, content files. In addition, we must define the type of resource that will be created, in this case genericresource.

Note: Make sure you save the notebook at this point, so that all notebook changes will be saved into the new HydroShare resource.

Archive the downloaded data files for collaborative use

Create list of files to save to HydroShare. Verify location and names.


In [11]:
# mappingfiles
mappingfile1 # sauk
mappingfile2 # elwha
mappingfile3 # riosalado

# Multisite Map
annotated_map 

# Elevation gradient maps
sauk_gcG = os.path.join(homedir, 'gcGradient_s.png')
elwha_gcG = os.path.join(homedir, 'gcGradient_e.png')
riosalado_gcG = os.path.join(homedir, 'gcGradient_r.png')

In [ ]:
# Prepare for new HydroShare Generic Resource
title = 'TreatGeoSelf to study site mapping files'
abstract = 'These mapping files contain the 1/16-degree gridded cell centroids for each study site of interest. Each csv file was generated from intersecting the 1/16-degree Contiguous United States gridded cell centroids with the study site polygon shapefile. The mapping files describe each gridded cell as a row, the longitude (LONG_) and latitude (LAT) coordinates to five decimal digits, and the average elevation of each gridded cell area. The gridded cell gradients from each mapping file was rendered onto an ESRI Image World 2D basemap for visual reference.'
keywords = ['Sauk', 'Elwha','Rio Salado','gridded cell gradients','climate','hydromet','watershed'] 
rtype = 'genericresource'

# Files to migrate
files=[mappingfile1, mappingfile2, mappingfile3, annotated_map, sauk_gcG, elwha_gcG, riosalado_gcG]

# Create the new resource
resource_id = hs.createHydroShareResource(title=title,
                                          abstract=abstract, 
                                          keywords=keywords, 
                                          resource_type=rtype, 
                                          content_files=files, 
                                          public=False)