The WPS flyingpigeon provides several processes to perform spatial subsets of netCDF files:
Note that subsetting operations do not average or reduce results over the selected area, they only return the grid sells intersecting with the area. A mask is applied to cells that are outside the selected area, but inside the rectangular grid required for netCDF files.
In [1]:
# Import the WPS client and connect to the server
from birdy import WPSClient
import birdy
from os import environ
# To display Images from an url
from IPython.core.display import HTML
from IPython.display import Image
# wait until WPS process is finished
import time
In [2]:
# This cell is for server admnistration test purpose
# Ignore this cell and modify the following cell according to your needs
fp_server = environ.get('WPS_URL')
print(fp_server) # link to the flyingpigoen server
In [3]:
# URL to a flyingpigeon server
# fp_server = 'https://pavics.ouranos.ca/twitcher/ows/proxy/flyginpigeon/wps'
Connect to the server with birdy client
In [4]:
# Simple connection (not recommended for larger processing)
fp = WPSClient(fp_server)
# Asyncronous connection with progess status requests
fp_i = WPSClient(url=fp_server, progress=True)
Explore the available processes:
In [5]:
# Enter `fp?` for general exploration on processes provided by flyingpigeon
# fp?
# Or get help for a process in particular
help(fp.subset_continents) # or type: fp.subset_countries?
In [11]:
# some test data
# 'https://github.com/roocs/mini-esgf-data/raw/master/test_data/group_workspaces/jasmin2/cp4cds1/data/c3s-cordex/output/EUR-11/IPSL/MOHC-HadGEM2-ES/rcp85/r1i1p1/IPSL-WRF381P/v1/day/psl/v20190212/'
# 'psl_EUR-11_MOHC-HadGEM2-ES_rcp85_r1i1p1_IPSL-WRF381P_v1_day_20060101-20101231.nc'
# Specify input files
base_url = 'https://www.esrl.noaa.gov/psd/thredds/fileServer/Datasets/ncep.reanalysis.dailyavgs/surface/'
urls = [base_url + f'slp.{year}.nc' for year in range(2000, 2002)]
In [7]:
# bbox subset of one file without interaction to the server
resp = fp.subset_bbox(
resource=urls[0], # can be also a list of files but it is recommended to run large processes with fp_i
lon0=20,
lon1=70,
lat0=10,
lat1=50,
start=None, # optional to select a time periode
end=None,
variable=None, # can be set if the variable is known, otherwise the process detects the variable name
)
# Check the output files:
bbox = resp.get()
In [8]:
# bbox of multiple files
resp = fp_i.subset_bbox(
resource=urls,
lon0=20,
lon1=70,
lat0=10,
lat1=50,
start=None, # optional to select a time periode
end=None,
variable=None, # can be set if the variable is known, otherwise the process detects the variable name
)
# using fp_i you need to wait until the processing is complete!
timeout = time.time() + 60*5 # 5 minutes from now
while resp.getStatus() != 'ProcessSucceeded':
time.sleep(1)
if time.time() > timeout: # to avoid endless waiting if the process failed
break
# Get the output files:
bbox = resp.get()
In [9]:
# Plot the test file with the flyingpigeon plot function
out_plot = fp.plot_map_timemean(resource=bbox.output)
# and display the output
Image(url= out_plot.get()[0], width=400)
Out[9]:
In [12]:
# If you want to download the result files:
# Download files stored in a metalink
# requires https://github.com/metalink-dev/pymetalink
#
from metalink import download
import tempfile
path = tempfile.mkdtemp()
files = download.get(bbox.metalink, path=path, segmented=False, force=True)
len(files)
Out[12]:
As for bounding box subsetting, all netCDF input files are subsetted individually. Here, it is possible to specifiy one or multiple continents. If mosaic is True, all continents are merged into one shape before subsetting. If False, each input file is subsetted over each polygon, such that the number of output files is the number of input files times the number of continents.
The subset_countries process works the same with polygons for countries instead of continents.
There are two outputs:
In [13]:
# Run the process
resp = fp_i.subset_continents(resource=urls, region=['Europe', 'Africa'], mosaic=True)
# using fp_i you need to wait until the processing is complete!
timeout = time.time() + 60*5 # 5 minutes from now
while resp.getStatus() != 'ProcessSucceeded':
time.sleep(1)
if time.time() > timeout: # to avoid endless waiting if the process failed
break
# Check the output files:
cont = resp.get()
In [14]:
# plot the test file (url of output) with the flyingpigon plot function
resp = fp.plot_map_timemean(resource=cont.output)
In [15]:
# The plot process returnes one graphic file.
# It can be plotted directly by asking birdy to get python objects, instead of links to files.
resp.get(asobj=True).plotout_map
Out[15]:
In [16]:
inputs = dict(nc="https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/nrcan"
"/nrcan_northamerica_monthly"
"/tasmax/nrcan_northamerica_monthly_2015_tasmax.nc",
typename="public:USGS_HydroBASINS_lake_na_lev12",
geoserver="https://pavics.ouranos.ca/geoserver/wfs",
fid1="USGS_HydroBASINS_lake_na_lev12.67061",
fid2="USGS_HydroBASINS_lake_na_lev12.67088",
mosaic=False)
# resp = fp_i.subset_wfs_polygon(resource=inputs['nc'],
# typename=inputs['typename'],
# featureids=inputs['fid1'],
# geoserver=inputs['geoserver'],
# mosaic=False,
# start=None,
# end=None,
# variable=None,
# )
In [17]:
# # using fp_i you need to wait until the processing is complete!
# timeout = time.time() + 60*5 # 5 minutes from now
# while resp.getStatus() != 'ProcessSucceeded':
# time.sleep(1)
# if time.time() > timeout: # to avoid endless waiting if the process failed
# break
# outputs = resp.get()
In [ ]: