In [1]:
from utilities import css_styles
css_styles()


Out[1]:

Accessing NERRS dissolved oxygen station data with Pyoos

This notebook is largely based on Emilio Mayorga's excellent pyoos_nerrs_pandas_demo1.ipynb. It is currently weak on the discovery phase: I already knew the prefix code for the area I wanted, and even then more station codes were returned than are actually returning data. It adapts Emilio's code to plot dissolved oxygen data from NERRS stations.

It will not work unless your IP address has been registered with the NERRS/CDMO office; contact them to request access.


In [2]:
from datetime import datetime, timedelta
import pandas as pd
from pyoos.collectors.nerrs.nerrs_soap import NerrsSoap

In [3]:
# FROM pyoos SOS handling
# Convenience function to build record style time series representation
def flatten_element(p):
    rd = {'time':p.time}
    for m in p.members:
        rd[m['standard']] = m['value']
    return rd

# sta.get_unique_members() serves the same function as the pyoos SOS get_unique_members method
# Convenience function to extract a dict of unique members (observed properties) by standard name
obsprops_bystdname = lambda sta: {m['standard']:m for m in sta.get_unique_members()}

In [4]:
# pyoos NERRS collector
nerrsData = NerrsSoap()

In [5]:
# Get all the Kachemak Bay (Alaska) water quality stations: codes starting with 'kac' and ending with 'wq'
stations = [featureid for featureid in nerrsData.list_features()
 if featureid.startswith('kac') and featureid.endswith('wq')]

for i, station in enumerate(stations):
    print('[' + str(i) + ']: ' + station)


[0]: kacbcwq
[1]: kacdlwq
[2]: kach3wq
[3]: kachdwq
[4]: kachowq
[5]: kachswq
[6]: kacpgwq
[7]: kacsdwq
[8]: kacsewq
[9]: kacsswq

I'm not sure how to determine which of these stations actually have data via this method; maybe Kyle can help here? I know from the CDMO that only Seldovia Deep (KACSDWQ) and Homer Dolphin Deep (KACHDWQ) actually have DO data.
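
One possible approach to the question above is to request a short window from each candidate station and count the observations that carry a DO value. The sketch below relies only on calls already used in this notebook (`filter`, `collect`, `response.elements`, `sta.elements`, `p.members`). The helper names `count_member_values` and `stations_with_member` are made up for illustration, and how the collector behaves for a station without data (empty result or exception) is an assumption, so failed requests are recorded as `None` rather than hidden.

```python
def count_member_values(station, member_name):
    """Count observations of a station feature that have a non-null value for member_name."""
    count = 0
    for point in station.elements:
        for m in point.members:
            if m['name'] == member_name and m['value'] is not None:
                count += 1
                break  # one hit per observation is enough
    return count


def stations_with_member(collector, station_codes, start, end, member_name='DO_mgl'):
    """Return {station_code: observation count}, or None where the request failed.

    `collector` is expected to behave like the NerrsSoap instance used in
    this notebook: filter(features=..., start=..., end=...) followed by collect().
    """
    counts = {}
    for code in station_codes:
        try:
            collector.filter(features=[code], start=start, end=end)
            response = collector.collect()
            counts[code] = sum(count_member_values(sta, member_name)
                               for sta in response.elements)
        except Exception:
            # Assumption: a station without data may raise instead of
            # returning an empty collection, so the failure is recorded.
            counts[code] = None
    return counts
```

With the objects defined in this notebook, a call might look like `stations_with_member(nerrsData, stations, start, end)`; stations whose count is 0 or `None` could then be dropped before plotting. Each station costs one request to the service.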


In [6]:
# Access the Homer Dolphin Deep station (kachdwq, stations[3]) for roughly the last 7 days, ending 12 hours ago
nerrsData.filter(features=[stations[3]],
                  start=datetime.utcnow() - timedelta(days=7),
                  end=datetime.utcnow()  - timedelta(hours=12))
#nerrsData.filter(variables=["ATemp"])
response = nerrsData.collect()

In [7]:
# The raw response (a dict keyed by station code) is not used outside this cell; the collect() response is what's used.
# It is shown here just for reference.
raw = nerrsData.raw()
type(raw), raw.keys()


Out[7]:
(dict, ['kachdwq'])

In [8]:
# response.elements is a one-element array with a paegan.cdm.dsg.features.station.Station element
response.elements


Out[8]:
[<paegan.cdm.dsg.features.station.Station at 0x7f5265081fd0>]

In [9]:
# Looks like the station in the response doesn't include any info about the Reserve it belongs to. Too bad.
# Or is one of the pieces of information below the NERRS site?
sta = response.elements[0]
sta.__dict__.keys()


Out[9]:
['_type',
 '_location',
 '_description',
 '_name',
 '_uid',
 '_elements',
 '_properties']

In [10]:
sta.uid, sta.name, sta.description, sta.type, sta.location, sta.properties


Out[10]:
('kachdwq',
 'Homer Dolphin Deep',
 None,
 'timeSeries',
 <shapely.geometry.point.Point at 0x7f5264ad38d0>,
 <bound method Station.properties of <paegan.cdm.dsg.features.station.Station object at 0x7f5265081fd0>>)

In [11]:
# 'siteid' and 'location_description' seem to refer to the NERRS reserve/site
sta.get_property('siteid'), sta._properties


Out[11]:
('kac',
 {'horizontal_crs': 'EPSG:4326',
  'location_description': 'Kachemak Bay',
  'siteid': 'kac',
  'state': 'ak',
  'vertical_crs': 'EPSG:4297',
  'vertical_units': 'm'})

In [12]:
staloc = sta.get_location()
print('%s || %s || %s' % (staloc, staloc.type, staloc.xy))


POINT Z (151.40878 59.60201 0) || Point || (array('d', [151.40878]), array('d', [59.60201]))

In [13]:
obsprops_bystdname_dict = obsprops_bystdname(sta)
obsprops_bystdname_dict['oxygen_concentration_in_sea_water']


Out[13]:
{'description': 'DO_mgl',
 'name': 'DO_mgl',
 'standard': 'oxygen_concentration_in_sea_water',
 'unit': 'mg/L'}

In [14]:
# The individual observations are returned in the station "elements"
stael = sta.elements
type(stael), len(stael)


Out[14]:
(list, 606)

In [15]:
stael0 = stael[0]
stael0.time
# See sta.get_unique_members(), above
# stael0.get_member_names() returns a list of member names for this station 'element'


Out[15]:
datetime.datetime(2014, 8, 6, 23, 15, tzinfo=<UTC>)

In [16]:
stael0.members


Out[16]:
[{'description': 'Depth',
  'name': 'Depth',
  'standard': None,
  'unit': None,
  'value': 11.94},
 {'description': 'DO_mgl',
  'name': 'DO_mgl',
  'standard': 'oxygen_concentration_in_sea_water',
  'unit': 'mg/L',
  'value': 10.2},
 {'description': 'DO_pct',
  'name': 'DO_pct',
  'standard': 'oxygen_concentration_in_sea_water',
  'unit': '%',
  'value': 111.3},
 {'description': 'pH',
  'name': 'pH',
  'standard': 'sea_water_acidity',
  'unit': '',
  'value': 8.1},
 {'description': 'Sal',
  'name': 'Sal',
  'standard': 'sea_water_salinity',
  'unit': 'ppt',
  'value': 31.3},
 {'description': 'SpCond',
  'name': 'SpCond',
  'standard': 'specific_conductance',
  'unit': 'mS/cm',
  'value': 48.29},
 {'description': 'Temp',
  'name': 'Temp',
  'standard': 'sea_water_temperature',
  'unit': '\xc2\xb0C',
  'value': 10.4},
 {'description': 'Turb',
  'name': 'Turb',
  'standard': 'sea_water_turbidity',
  'unit': 'NTU',
  'value': 2.0}]

In [17]:
# From paegan: flatten Returns a Generator of Points that are part of this collection
# Just exploring what this does...
response.flatten


Out[17]:
<bound method StationCollection.flatten of <paegan.cdm.dsg.collections.station_collection.StationCollection object at 0x7f5264ad3950>>

In [18]:
# FROM pyoos SOS handling
# For first (and only) station
flattenedsta_0 = [flatten_element(p) for p in sta.elements]
sta_0df = pd.DataFrame.from_records(flattenedsta_0, index=['time'])
sta_0df.head()


Out[18]:
                            None  oxygen_concentration_in_sea_water  sea_water_acidity  sea_water_salinity  sea_water_temperature  sea_water_turbidity  specific_conductance
time
2014-08-06 23:15:00+00:00  11.94                              111.3                8.1                31.3                   10.4                    2                 48.29
2014-08-06 23:00:00+00:00  12.17                              107.5                8.0                31.4                   10.2                    2                 48.45
2014-08-06 22:45:00+00:00  12.34                               98.5                8.0                31.6                   10.0                    1                 48.66
2014-08-06 22:30:00+00:00  12.52                               90.7                7.9                31.7                    9.8                    1                 48.88
2014-08-06 21:15:00+00:00  13.29                               89.9                7.9                31.8                    9.6                    1                 48.96
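
Note that in the table above the `oxygen_concentration_in_sea_water` column holds the `DO_pct` values (111.3 in the first row) rather than `DO_mgl` (10.2), and `Depth` ends up in a column named `None`. This follows from `flatten_element` keying each record on `m['standard']`: `DO_mgl` and `DO_pct` share one standard name, so the later member overwrites the earlier one. Below is a minimal sketch of one way around this, keying on the member `name` instead. `flatten_element_by_name` is a hypothetical variant, and `Obs` is a stand-in for a station element that carries a few of the member values shown in Out[16].

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for a station element: only .time and .members are used here.
Obs = namedtuple('Obs', ['time', 'members'])


def flatten_element_by_name(p):
    """Record-style dict keyed on each member's 'name' instead of its 'standard'."""
    rd = {'time': p.time}
    for m in p.members:
        rd[m['name']] = m['value']
    return rd


# A subset of the members of the first observation (time zone info omitted for brevity)
obs = Obs(
    time=datetime(2014, 8, 6, 23, 15),
    members=[
        {'name': 'Depth', 'standard': None, 'unit': None, 'value': 11.94},
        {'name': 'DO_mgl', 'standard': 'oxygen_concentration_in_sea_water',
         'unit': 'mg/L', 'value': 10.2},
        {'name': 'DO_pct', 'standard': 'oxygen_concentration_in_sea_water',
         'unit': '%', 'value': 111.3},
        {'name': 'Temp', 'standard': 'sea_water_temperature',
         'unit': '\xb0C', 'value': 10.4},
    ],
)

record = flatten_element_by_name(obs)
# Both DO members now survive under their own keys
print(sorted(k for k in record if k != 'time'))
```

Records built this way should give `pd.DataFrame.from_records` separate `DO_mgl` and `DO_pct` columns, at the cost of column names that are NERRS parameter codes rather than standard names.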

In [19]:
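# Time series plot.
# Caveat: DO_mgl and DO_pct share this standard name, so the column plotted here
# holds the DO_pct values (see Out[18]) while the unit label comes from DO_mgl.
obsprop_name = 'oxygen_concentration_in_sea_water'
obsprop = obsprops_bystdname_dict[obsprop_name]
ax = sta_0df[obsprop_name].plot()
ax.set_ylabel(obsprop_name + ' (' + obsprop['unit'] + ')');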


