1. HydroShare setup and preparation
2. Retrieve a mapping file (contains gridded cell centroids) for the study site of interest
3. Retrieve a folder of datafiles that were previously obtained for the study site of interest
4. Remap the file directories for each gridded cell centroid in the mapping file
To run this notebook, we must import several libraries. These are listed in order: 1) Python standard libraries; 2) the hs_utils library, which provides functions for interacting with HydroShare, including resource querying, downloading, and creation; and 3) the observatory_gridded_hydromet (ogh) library that is downloaded with this notebook.
If the Python library basemap-data-hires is not installed, please uncomment and run the following lines in the terminal.
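The exact install command depends on your environment; as a sketch, assuming a conda-based environment such as the HydroShare JupyterHub image, the package can be installed from conda-forge:

```shell
# Assumption: a conda-based environment (e.g., the HydroShare JupyterHub image).
conda install -c conda-forge basemap-data-hires
```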
In [ ]:
# data processing
import os
import ogh
import tarfile
# data migration library
from utilities import hydroshare
# silence warnings (optional)
# import warnings
# warnings.filterwarnings("ignore")
Establish a secure connection with HydroShare by instantiating the hydroshare class that is defined within hs_utils. In addition to connecting with HydroShare, this command also sets and prints environment variables for several parameters that will be useful for saving work back to HydroShare.
In [ ]:
hs = hydroshare.hydroshare()
homedir = hs.getContentPath(os.environ["HS_RES_ID"])
os.chdir(homedir)
print('Data will be loaded from and saved to: ' + homedir)
If you are curious about where the data is being downloaded, click on the Jupyter Notebook dashboard icon to return to the File System view. The homedir directory location printed above is where you can find the data and contents you download to the HydroShare JupyterHub server. At the end of this work session, you can migrate this data to the HydroShare iRODS server as a Generic Resource.
Here, we will retrieve two data objects and then catalog the files within the mapping file. The HydroShare resource 'https://www.hydroshare.org/resource/3629f2d5315b48fdb8eb851c1dd9ce63/' contains the mapping file for a test study site. The zipfolder contains the WRF ASCII files (described in Salathe et al. 2014) from a previous data download run, and may contain more files than are necessary for our study site.
First, we will need to migrate these objects into the computing environment and designate their path directories. Then we will unzip the zipfolder and catalog the two data products into the mapping file under two dataset names.
In [ ]:
"""
Sample mapping file and previously downloaded files
"""
# List of available data
hs.getResourceFromHydroShare('3629f2d5315b48fdb8eb851c1dd9ce63')
folderpath = hs.getContentPath('3629f2d5315b48fdb8eb851c1dd9ce63') # the folder
mappingfile1 = os.path.abspath(hs.content['Sauk_mappingfile.csv']) # the mapping file in the folder
zipfolder = os.path.abspath(hs.content['salathe2014.tar.gz']) # the zipfolder in the folder
In [ ]:
os.listdir(folderpath)
In [ ]:
with tarfile.open(zipfolder) as tar:
    tar.extractall(path=folderpath)  # untar file into the same directory
os.remove(zipfolder)
os.listdir(folderpath)
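For reference, the extract-then-remove pattern above can be exercised on a small self-contained example. The file and directory names below are hypothetical stand-ins, not the actual salathe2014 contents:

```python
import os
import tarfile
import tempfile

# Build a tiny .tar.gz archive in a temporary directory, then extract it,
# mirroring the extract-then-remove pattern used with the zipfolder above.
workdir = tempfile.mkdtemp()
datafile = os.path.join(workdir, 'cell_47.7500_-121.7500.txt')
with open(datafile, 'w') as f:
    f.write('19500101 273.1\n')  # toy stand-in for a WRF ASCII record

archive = os.path.join(workdir, 'sample.tar.gz')
with tarfile.open(archive, 'w:gz') as tar:
    tar.add(datafile, arcname='raw/cell_47.7500_-121.7500.txt')

extractdir = os.path.join(workdir, 'extracted')
with tarfile.open(archive) as tar:
    tar.extractall(path=extractdir)  # untar into the target directory
os.remove(archive)  # remove the archive once its contents are unpacked

print(os.listdir(os.path.join(extractdir, 'raw')))
# → ['cell_47.7500_-121.7500.txt']
```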
In [ ]:
mapdf, nstations = ogh.mappingfileToDF(os.path.abspath(mappingfile1))
In [ ]:
help(ogh.remapCatalog)
In [ ]:
# dailywrf_salathe2014
ogh.remapCatalog(mappingfile=mappingfile1,
                 catalog_label='dailywrf_salathe2014',
                 homedir=folderpath,
                 subdir='salathe2014/WWA_1950_2010/raw')

# dailywrf_bcsalathe2014
ogh.remapCatalog(mappingfile=mappingfile1,
                 catalog_label='dailywrf_bcsalathe2014',
                 homedir=folderpath,
                 subdir='salathe2014/WWA_1950_2010/bc')
In [ ]:
mapdf, nstations = ogh.mappingfileToDF(os.path.abspath(mappingfile1))
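Re-reading the mapping file after remapping lets you confirm that the new catalog columns were added. As a hedged illustration of working with the returned table, here is a toy pandas DataFrame standing in for a mapping file of gridded cell centroids; the column names are illustrative only, not the actual ogh mapping-file schema:

```python
import pandas as pd

# Toy stand-in for a mapping file: one row per gridded cell centroid.
# Column names (FID, LAT, LONG_) are illustrative; the real ogh schema may differ.
mapdf = pd.DataFrame({
    'FID': [0, 1, 2],
    'LAT': [48.21875, 48.21875, 48.28125],
    'LONG_': [-121.46875, -121.53125, -121.46875],
})
nstations = len(mapdf)

print(nstations)      # → 3
print(mapdf.head())   # inspect the first rows of the centroid table
```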