Accessing a NERRS station with Pyoos, via CDMO SOAP web services

Illustrates querying all stations ("features") from a NERRS Reserve site; access to data from a NERRS station (specified by its station code); extraction of station metadata; and conversion of the returned multi-variable time series to a pandas DataFrame, followed by a time series plot from the DataFrame. Builds off the work from Dan Ramage (SECOORA), whose code is listed in the last cell, at the end. Note that this is running from a pyoos fork with some small but key changes to the nerrs collector. 2014 May 8-10. Emilio Mayorga.


In [1]:
from datetime import datetime, timedelta
import pandas as pd
from pyoos.collectors.nerrs.nerrs_soap import NerrsSoap

In [2]:
# FROM pyoos SOS handling
# Convenience function to build record style time series representation
def flatten_element(p):
    rd = {'time':p.time}
    for m in p.members:
        rd[m['standard']] = m['value']
    return rd

# sta.get_unique_members() serves the same function as the pyoos SOS get_unique_members method
# Convenience function to extract a dict of unique members (observed properties) by standard name
obsprops_bystdname = lambda sta: {m['standard']:m for m in sta.get_unique_members()}

First here's a very compact set of statements to get and plot the data for a station. No exploratory side trips.

NOTE: I manually removed (commented out) the NERRS/CDMO access token after running this notebook, before uploading notebook to my github gist. Replace 'TOKEN STRING' with a token obtained from the NERRS/CDMO office


In [3]:
# NERRS/CDMO access token.
accesstoken = 'TOKEN STRING'

# Initialize pyoos NERRS collector object
nerrsData = NerrsSoap()

In [4]:
# Access pdbpfmet station, for the last 7 days (roughly)
nerrsData.filter(features=['pdbpfmet'],
                  start=datetime.utcnow() - timedelta(days=7),
                  end=datetime.utcnow()  - timedelta(hours=12))

response = nerrsData.collect(accesstoken)

In [5]:
sta = response.elements[0]
obsprops_bystdname_dict = obsprops_bystdname(sta)

In [6]:
# FROM pyoos SOS handling
# For first (and only) station
flattenedsta_0 = map(flatten_element, sta.elements)
sta_0df = pd.DataFrame.from_records(flattenedsta_0, index=['time'])
sta_0df.head()


Out[6]:
air_pressure air_temperature cumulative_precipitation relative_humidity total_par_LiCor total_precipitation wind_direction_from_true_north wind_direction_standard_deviation wind_sped wind_speed_of_gust
time
2014-05-10 19:30:00+00:00 1021 12.1 0 81 651 0 212 22 1.8 3.1
2014-05-10 19:15:00+00:00 1021 11.6 0 83 462 0 192 17 2.0 3.7
2014-05-10 19:00:00+00:00 1021 11.4 0 83 394 0 189 12 2.2 4.1
2014-05-10 18:45:00+00:00 1021 11.4 0 84 451 0 193 14 2.0 3.8
2014-05-10 18:30:00+00:00 1021 11.3 0 83 420 0 202 15 2.0 4.0

5 rows × 10 columns


In [7]:
# Time series plot.
# "wind_speed" is currently mispelled; that's in pyoos, and can be fixed easily
obsprop_name = 'wind_sped'
obsprop = obsprops_bystdname_dict[obsprop_name]
sta_0df[obsprop_name].plot()
ylabel(obsprop_name + ' ('+obsprop['unit']+')');


Now the same thing, but with lots of exploration in between


In [8]:
# pyoos NERRS collector
nerrsData = NerrsSoap()

May 10: Not sure if this will work, b/c the access token is passed via the collect method, so it hasn't been passed here yet!


In [9]:
# Get all Padilla Bay stations (pdb)
[featureid for featureid in nerrsData.list_features() if featureid.startswith('pdb')]


Out[9]:
['pdbbpnut',
 'pdbbpwq',
 'pdbbynut',
 'pdbbywq',
 'pdbgdnut',
 'pdbgsnut',
 'pdbgswq',
 'pdbjenut',
 'pdbjewq',
 'pdbjlnut',
 'pdbjlwq',
 'pdbnnwq',
 'pdbpfmet']

In [10]:
# Access pdbpfmet station, for the last 7 days (roughly)
nerrsData.filter(features=['pdbpfmet'],
                  start=datetime.utcnow() - timedelta(days=7),
                  end=datetime.utcnow()  - timedelta(hours=12))
#nerrsData.filter(variables=["ATemp"])
response = nerrsData.collect()

In [11]:
# The raw response (a string) is not used outside this cell. The collect method response is what's used
# I'm showing the raw response here, just for reference
raw = nerrsData.raw()
type(raw), raw.keys()


Out[11]:
(dict, ['pdbpfmet'])

In [12]:
# response.elements is a one-element array with a paegan.cdm.dsg.features.station.Station element
response.elements


Out[12]:
[<paegan.cdm.dsg.features.station.Station at 0x4848a50>]

In [13]:
# Looks like the station in the response doesn't include any info about the Reserve it belongs to. Too bad.
# Or is one of the pieces of information below the NERRS site?
sta = response.elements[0]
sta.__dict__.keys()


Out[13]:
['_type',
 '_location',
 '_description',
 '_name',
 '_uid',
 '_elements',
 '_properties']

In [14]:
sta.uid, sta.name, sta.description, sta.type, sta.location, sta.properties


Out[14]:
('pdbpfmet',
 'Padilla Bay Farm',
 None,
 'timeSeries',
 <shapely.geometry.point.Point at 0x4848150>,
 <bound method Station.properties of <paegan.cdm.dsg.features.station.Station object at 0x4848a50>>)

In [15]:
# 'siteid' and 'location_description' seem to refer to the NERRS reserve/site
sta.get_property('siteid'), sta._properties


Out[15]:
('pdb',
 {'horizontal_crs': 'EPSG:4326',
  'location_description': 'Padilla Bay',
  'siteid': 'pdb',
  'state': 'wa',
  'vertical_crs': 'EPSG:4297',
  'vertical_units': 'm'})

In [16]:
staloc = sta.get_location()
print staloc, '||', staloc.type, '||', staloc.xy


POINT Z (122.469303 48.463847 0) || Point || (array('d', [122.469303]), array('d', [48.463847]))

In [17]:
obsprops_bystdname_dict = obsprops_bystdname(sta)
obsprops_bystdname_dict['wind_sped']


Out[17]:
{'description': 'WSpd', 'name': 'WSpd', 'standard': 'wind_sped', 'unit': 'm/s'}

In [18]:
# The individual observations are returned in the station "elements"
stael = sta.elements
type(stael), len(stael)


Out[18]:
(list, 656)

In [19]:
stael0 = stael[0]
stael0.time
# See sta.get_unique_members(), above
# stael0.get_member_names() returns a list of member names for this station 'element'


Out[19]:
datetime.datetime(2014, 5, 10, 19, 30, tzinfo=<UTC>)

In [20]:
stael0.members


Out[20]:
[{'description': 'ATemp',
  'name': 'ATemp',
  'standard': 'air_temperature',
  'unit': '\xc2\xb0C',
  'value': 12.1},
 {'description': 'BP',
  'name': 'BP',
  'standard': 'air_pressure',
  'unit': 'mb',
  'value': 1021.0},
 {'description': 'CumPrcp',
  'name': 'CumPrcp',
  'standard': 'cumulative_precipitation',
  'unit': 'mm',
  'value': 0.0},
 {'description': 'MaxWSpd',
  'name': 'MaxWSpd',
  'standard': 'wind_speed_of_gust',
  'unit': 'm/s',
  'value': 3.1},
 {'description': 'RH',
  'name': 'RH',
  'standard': 'relative_humidity',
  'unit': '%',
  'value': 81.0},
 {'description': 'SDWDir',
  'name': 'SDWDir',
  'standard': 'wind_direction_standard_deviation',
  'unit': 'sd',
  'value': 22.0},
 {'description': 'TotPAR',
  'name': 'TotPAR',
  'standard': 'total_par_LiCor',
  'unit': 'mmoles/m^2',
  'value': 651.0},
 {'description': 'TotPrcp',
  'name': 'TotPrcp',
  'standard': 'total_precipitation',
  'unit': 'mm',
  'value': 0.0},
 {'description': 'Wdir',
  'name': 'Wdir',
  'standard': 'wind_direction_from_true_north',
  'unit': '\xc2\xb0TN',
  'value': 212.0},
 {'description': 'WSpd',
  'name': 'WSpd',
  'standard': 'wind_sped',
  'unit': 'm/s',
  'value': 1.8}]

In [21]:
# From paegan: flatten Returns a Generator of Points that are part of this collection
# Just exploring what this does...
response.flatten


Out[21]:
<bound method StationCollection.flatten of <paegan.cdm.dsg.collections.station_collection.StationCollection object at 0x4848f90>>

In [22]:
# FROM pyoos SOS handling
# For first (and only) station
flattenedsta_0 = map(flatten_element, sta.elements)
sta_0df = pd.DataFrame.from_records(flattenedsta_0, index=['time'])
sta_0df.head()


Out[22]:
air_pressure air_temperature cumulative_precipitation relative_humidity total_par_LiCor total_precipitation wind_direction_from_true_north wind_direction_standard_deviation wind_sped wind_speed_of_gust
time
2014-05-10 19:30:00+00:00 1021 12.1 0 81 651 0 212 22 1.8 3.1
2014-05-10 19:15:00+00:00 1021 11.6 0 83 462 0 192 17 2.0 3.7
2014-05-10 19:00:00+00:00 1021 11.4 0 83 394 0 189 12 2.2 4.1
2014-05-10 18:45:00+00:00 1021 11.4 0 84 451 0 193 14 2.0 3.8
2014-05-10 18:30:00+00:00 1021 11.3 0 83 420 0 202 15 2.0 4.0

5 rows × 10 columns


In [23]:
# Time series plot.
# "wind_speed" is currently mispelled; that's in pyoos, and can be fixed easily
obsprop_name = 'wind_sped'
obsprop = obsprops_bystdname_dict[obsprop_name]
sta_0df[obsprop_name].plot()
ylabel(obsprop_name + ' ('+obsprop['unit']+')');


The block below is Dan's code, with some tweaks I had made. I'm no longer using it directly

Change this to return arrays and/or pandas data frames that pull out individual time series per variable
SEE THE CELL ABOVE THIS ONE!

for obsRec in response: for stationRec in response.get_elements():

  #stationRec = obsRec.feature
  print "**** Station: %s Location: %s" % (stationRec.name, stationRec.get_location())
  #The elements are a list of the observed_properties returned wrapped in a Point object.
  for obsProp in stationRec.get_elements():
    print "  -------------------"
    print "  - Observation Date/Time: %s" % (obsProp.get_time())
    #print "Member names: %s" % (obsProp.get_member_names())
    #I think that for a multi sensor request, there should be multiple members, each representing
    #a specific observed_property.
    for member in obsProp.get_members():
        #Apparently you're going to have to know how each collector parses the pieces of the data.
        #For an SOS query, there appear to be: name, units, value, and standard(CF MMI link).
        #print "    ------\n    member.keys() = %s" % member.keys()
        # member.keys() = ['value', 'description', 'name', 'unit', 'standard']
        m = member
        member_values_tup = (m['name'], m['description'], m['standard'], m['value'], m['unit'])
        print "name: %s (description=%s, standard=%s); value=%s, unit=%s" % member_values_tup
        #for key,value in member.iteritems():
        #    print "    %s = %s" % (key, value)