This notebook illustrates querying all stations ("features") from a NERRS Reserve site; accessing data from a NERRS station, specified by its station code; extracting station metadata; and converting the returned multi-variable time series to a pandas DataFrame, followed by a time series plot from the DataFrame. It builds on work by Dan Ramage (SECOORA), whose code is listed in the last cell. Note that this runs from a pyoos fork with some small but key changes to the nerrs collector. 2014 May 8-10. Emilio Mayorga.
In [1]:
from datetime import datetime, timedelta
import pandas as pd
from pyoos.collectors.nerrs.nerrs_soap import NerrsSoap
In [2]:
# FROM pyoos SOS handling
# Convenience function to build record style time series representation
def flatten_element(p):
    rd = {'time':p.time}
    for m in p.members:
        rd[m['standard']] = m['value']
    return rd
# sta.get_unique_members() serves the same function as the pyoos SOS get_unique_members method
# Convenience function to extract a dict of unique members (observed properties) by standard name
obsprops_bystdname = lambda sta: {m['standard']:m for m in sta.get_unique_members()}
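To make the two helpers above concrete, here is a minimal sketch that runs them against stand-in objects. `FakePoint` and `FakeStation` are hypothetical stand-ins, not pyoos or paegan classes; they only mimic the attributes the helpers read (`time`, `members`, `get_unique_members()`), and the sample values and units are invented.

```python
from datetime import datetime


class FakePoint(object):
    """Hypothetical stand-in for one observation: a time plus member dicts."""
    def __init__(self, time, members):
        self.time = time
        self.members = members


class FakeStation(object):
    """Hypothetical stand-in for a station holding a list of points."""
    def __init__(self, elements):
        self.elements = elements

    def get_unique_members(self):
        # One member dict per distinct standard name; first occurrence wins.
        seen = {}
        for point in self.elements:
            for m in point.members:
                seen.setdefault(m['standard'], m)
        return list(seen.values())


def flatten_element(p):
    # Same logic as the notebook helper: one flat record per observation time.
    rd = {'time': p.time}
    for m in p.members:
        rd[m['standard']] = m['value']
    return rd


def obsprops_bystdname(sta):
    # Same logic as the notebook lambda, written as a named function.
    return {m['standard']: m for m in sta.get_unique_members()}


point = FakePoint(
    datetime(2014, 5, 8, 12, 0),
    [{'standard': 'air_temperature', 'value': 11.5, 'unit': 'C'},
     {'standard': 'wind_sped', 'value': 3.2, 'unit': 'm/s'}])
station = FakeStation([point])

record = flatten_element(point)      # flat dict keyed by 'time' and standard names
props = obsprops_bystdname(station)  # member dicts keyed by standard name
```

Each flattened record becomes one row of the eventual DataFrame, while the dict of observed properties keeps the per-variable metadata (such as the unit) available for labeling.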
NOTE: I manually removed the NERRS/CDMO access token after running this notebook, before uploading it to my GitHub gist. Replace 'TOKEN STRING' with a token obtained from the NERRS/CDMO office.
In [3]:
# NERRS/CDMO access token.
accesstoken = 'TOKEN STRING'
# Initialize pyoos NERRS collector object
nerrsData = NerrsSoap()
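One way to avoid editing the token out by hand before sharing the notebook is to read it from an environment variable. This is only a sketch: the helper `read_access_token` and the variable name `NERRS_CDMO_TOKEN` are hypothetical, chosen for illustration, and are not part of pyoos or of any NERRS/CDMO convention.

```python
import os


def read_access_token(env_var='NERRS_CDMO_TOKEN', default='TOKEN STRING'):
    """Return the token from the environment, or the placeholder if unset.

    Both the function and the variable name are hypothetical.
    """
    return os.environ.get(env_var, default)


accesstoken = read_access_token()
```

With this approach the notebook text never contains the real token, so nothing needs to be removed before publishing.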
In [4]:
# Access pdbpfmet station, for the last 7 days (roughly)
nerrsData.filter(features=['pdbpfmet'],
                 start=datetime.utcnow() - timedelta(days=7),
                 end=datetime.utcnow() - timedelta(hours=12))
response = nerrsData.collect(accesstoken)
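The request above calls `datetime.utcnow()` twice, so the start and end are measured from two slightly different instants. That is harmless for a window of roughly a week, but the bounds can also be derived from a single reference time. The sketch below assumes a hypothetical helper, `query_window`, with hypothetical parameters `days_back` and `lag_hours`; it uses a fixed reference time so the result is reproducible, where the notebook would pass the current UTC time.

```python
from datetime import datetime, timedelta


def query_window(now, days_back=7, lag_hours=12):
    """Return (start, end), both measured from the same reference time."""
    start = now - timedelta(days=days_back)
    end = now - timedelta(hours=lag_hours)
    return start, end


# Fixed reference time for illustration only
start, end = query_window(datetime(2014, 5, 10, 0, 0))
```

The resulting pair can be passed as the `start` and `end` arguments of the filter call.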
In [5]:
sta = response.elements[0]
obsprops_bystdname_dict = obsprops_bystdname(sta)
In [6]:
# FROM pyoos SOS handling
# For first (and only) station
flattenedsta_0 = map(flatten_element, sta.elements)
sta_0df = pd.DataFrame.from_records(flattenedsta_0, index=['time'])
sta_0df.head()
Out[6]:
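As an illustration of the conversion in the cell above, this sketch builds the same kind of DataFrame from hand-written records shaped like the output of `flatten_element`. The records and their values are invented, not NERRS data. They are held in a plain list, which sidesteps the fact that `map` returns a one-shot iterator under Python 3.

```python
from datetime import datetime

import pandas as pd

# Invented records, shaped like the dicts returned by flatten_element
records = [
    {'time': datetime(2014, 5, 8, 12, 0), 'air_temperature': 11.5, 'wind_sped': 3.2},
    {'time': datetime(2014, 5, 8, 12, 15), 'air_temperature': 11.7, 'wind_sped': 2.9},
]

# index='time' moves the 'time' field out of the columns and into the index
df = pd.DataFrame.from_records(records, index='time')
```

Each remaining column is then a time-indexed series for one observed property, which is what the plotting step relies on.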
In [7]:
# Time series plot.
# "wind_speed" is currently misspelled as "wind_sped"; that's in pyoos, and can be fixed easily
obsprop_name = 'wind_sped'
obsprop = obsprops_bystdname_dict[obsprop_name]
sta_0df[obsprop_name].plot()
ylabel(obsprop_name + ' ('+obsprop['unit']+')');
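Because the misspelled name may be corrected in pyoos at some point, a lookup that accepts either spelling keeps the plotting cell working across versions. This is a minimal sketch: `first_present` is a hypothetical helper, and the `obsprops` dict is an invented stand-in for `obsprops_bystdname_dict`.

```python
def first_present(mapping, candidates):
    """Return the first candidate key found in mapping, or raise KeyError."""
    for key in candidates:
        if key in mapping:
            return key
    raise KeyError('none of %r found' % (candidates,))


# Invented stand-in for obsprops_bystdname_dict
obsprops = {'wind_sped': {'unit': 'm/s'}, 'air_temperature': {'unit': 'C'}}

# Prefer the correct spelling, fall back to the current misspelling
obsprop_name = first_present(obsprops, ['wind_speed', 'wind_sped'])
label = obsprop_name + ' (' + obsprops[obsprop_name]['unit'] + ')'
```

The `label` string is built the same way as the y-axis label in the plotting cell.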
In [8]:
# pyoos NERRS collector
nerrsData = NerrsSoap()
In [9]:
# Get all Padilla Bay stations (pdb)
[featureid for featureid in nerrsData.list_features() if featureid.startswith('pdb')]
Out[9]:
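The prefix filter above can be tried without a live collector. The sketch below assumes that a station code is laid out as a three-letter reserve prefix, a two-letter station identifier, and a trailing data-type suffix; that layout is an assumption inferred from the 'pdb' prefix and the 'pdbpfmet' code used in this notebook. `split_station_code` is a hypothetical helper, and every code other than 'pdbpfmet' is invented.

```python
def split_station_code(code):
    """Split a code into (reserve, station, kind), assuming a 3 + 2 + rest layout."""
    return code[:3], code[3:5], code[5:]


# 'pdbpfmet' comes from this notebook; the other codes are invented
codes = ['pdbpfmet', 'pdbxxwq', 'abcyymet']

# Same filter as the cell above, applied to the invented list
pdb_codes = [c for c in codes if c.startswith('pdb')]
parts = split_station_code('pdbpfmet')
```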
In [10]:
# Access pdbpfmet station, for the last 7 days (roughly)
nerrsData.filter(features=['pdbpfmet'],
                 start=datetime.utcnow() - timedelta(days=7),
                 end=datetime.utcnow() - timedelta(hours=12))
#nerrsData.filter(variables=["ATemp"])
response = nerrsData.collect()
In [11]:
# The raw response (a string) is not used outside this cell. The collect method response is what's used
# I'm showing the raw response here, just for reference
raw = nerrsData.raw()
type(raw), raw.keys()
Out[11]:
In [12]:
# response.elements is a one-element array with a paegan.cdm.dsg.features.station.Station element
response.elements
Out[12]:
In [13]:
# Looks like the station in the response doesn't include any info about the Reserve it belongs to. Too bad.
# Or is one of the pieces of information below the NERRS site?
sta = response.elements[0]
sta.__dict__.keys()
Out[13]:
In [14]:
sta.uid, sta.name, sta.description, sta.type, sta.location, sta.properties
Out[14]:
In [15]:
# 'siteid' and 'location_description' seem to refer to the NERRS reserve/site
sta.get_property('siteid'), sta._properties
Out[15]:
In [16]:
staloc = sta.get_location()
print staloc, '||', staloc.type, '||', staloc.xy
In [17]:
obsprops_bystdname_dict = obsprops_bystdname(sta)
obsprops_bystdname_dict['wind_sped']
Out[17]:
In [18]:
# The individual observations are returned in the station "elements"
stael = sta.elements
type(stael), len(stael)
Out[18]:
In [19]:
stael0 = stael[0]
stael0.time
# See sta.get_unique_members(), above
# stael0.get_member_names() returns a list of member names for this station 'element'
Out[19]:
In [20]:
stael0.members
Out[20]:
In [21]:
# From the paegan docs: flatten "returns a Generator of Points that are part of this collection"
# Just exploring what this does...
response.flatten
Out[21]:
In [22]:
# FROM pyoos SOS handling
# For first (and only) station
flattenedsta_0 = map(flatten_element, sta.elements)
sta_0df = pd.DataFrame.from_records(flattenedsta_0, index=['time'])
sta_0df.head()
Out[22]:
In [23]:
# Time series plot.
# "wind_speed" is currently misspelled as "wind_sped"; that's in pyoos, and can be fixed easily
obsprop_name = 'wind_sped'
obsprop = obsprops_bystdname_dict[obsprop_name]
sta_0df[obsprop_name].plot()
ylabel(obsprop_name + ' ('+obsprop['unit']+')');
Change this to return arrays and/or pandas data frames that pull out individual time series per variable
SEE THE CELL ABOVE THIS ONE!
#for obsRec in response:
for stationRec in response.get_elements():
    #stationRec = obsRec.feature
    print "**** Station: %s Location: %s" % (stationRec.name, stationRec.get_location())
    #The elements are a list of the observed_properties returned wrapped in a Point object.
    for obsProp in stationRec.get_elements():
        print " -------------------"
        print " - Observation Date/Time: %s" % (obsProp.get_time())
        #print "Member names: %s" % (obsProp.get_member_names())
        #I think that for a multi sensor request, there should be multiple members, each representing
        #a specific observed_property.
        for member in obsProp.get_members():
            #Apparently you're going to have to know how each collector parses the pieces of the data.
            #For an SOS query, there appear to be: name, units, value, and standard(CF MMI link).
            #print " ------\n member.keys() = %s" % member.keys()
            # member.keys() = ['value', 'description', 'name', 'unit', 'standard']
            m = member
            member_values_tup = (m['name'], m['description'], m['standard'], m['value'], m['unit'])
            print "name: %s (description=%s, standard=%s); value=%s, unit=%s" % member_values_tup
            #for key,value in member.iteritems():
            #    print " %s = %s" % (key, value)
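The goal stated before this listing, pulling out an individual time series per variable, can be sketched without pandas as a grouping step. This is an illustrative sketch, not the author's method: `series_by_variable` is a hypothetical helper, and the observations are invented (time, members) pairs. With real pyoos objects, such pairs would presumably come from `obsProp.get_time()` and `obsProp.get_members()` inside the loops above.

```python
from collections import defaultdict
from datetime import datetime


def series_by_variable(points):
    """Group (time, value) pairs by each member's 'standard' name."""
    series = defaultdict(list)
    for time, members in points:
        for m in members:
            series[m['standard']].append((time, m['value']))
    return dict(series)


# Invented observations: (time, list of member dicts) pairs
points = [
    (datetime(2014, 5, 8, 12, 0),
     [{'standard': 'air_temperature', 'value': 11.5},
      {'standard': 'wind_sped', 'value': 3.2}]),
    (datetime(2014, 5, 8, 12, 15),
     [{'standard': 'air_temperature', 'value': 11.7},
      {'standard': 'wind_sped', 'value': 2.9}]),
]

series = series_by_variable(points)
```

Each entry of `series` is a list of (time, value) pairs for one variable, which can be turned into an array or a pandas Series as needed.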