CSW Search on bounding box, time extent and anyText

IOOS System Test: Extreme Events Theme: Inundation

Compare modeled water levels with observations for a specified bounding box and time period using IOOS recommended service standards for catalog search (CSW) and data retrieval (OPeNDAP & SOS).

Query CSW to find datasets that match criteria
Extract OPeNDAP data endpoints from model datasets and SOS endpoints from observational datasets
OPeNDAP model datasets will be granules
SOS endpoints may be datasets (from ncSOS) or collections of datasets (from NDBC, CO-OPS SOS servers)
Filter SOS services to obtain datasets
Extract data from SOS datasets
Extract data from model datasets at locations of observations
Compare time series data on same vertical datum



In [1]:

    
from pylab import *
from owslib.csw import CatalogueServiceWeb
from owslib import fes
import random
import netCDF4
import pandas as pd
import datetime as dt
from pyoos.collectors.coops.coops_sos import CoopsSos
import cStringIO
import iris
import urllib2
import parser
from lxml import etree

Specify a bounding box and time range of interest:



In [149]:

    
#box=[-74.4751, 40.3890, -73.7432, 40.9397]
box=[-76.4751, 38.3890, -71.7432, 42.9397]
box=[-75., 39., -71., 41.5]
box=[39., -75., 41.5, -71.0]
#box=[-180, -90, 180, 90]

# specific specific times (UTC) ...

# hurricane sandy
jd_start = dt.datetime(2012,10,26)
jd_stop = dt.datetime(2012,11,2)

# 2014 feb 10-15 storm
jd_start = dt.datetime(2014,2,10)
jd_stop = dt.datetime(2014,2,15)

# 2014 recent
jd_start = dt.datetime(2014,3,8)
jd_stop = dt.datetime(2014,3,11)

# 2011 
#jd_start = dt.datetime(2013,4,20)
#jd_stop = dt.datetime(2013,4,24)

# ... or relative to now
jd_start = dt.datetime.utcnow()- dt.timedelta(days=0)
jd_stop = dt.datetime.utcnow() + dt.timedelta(days=0)

# round to nearest minute
start_date = jd_start.strftime('%Y-%m-%dT%H:%M:00Z')
stop_date  = jd_stop.strftime('%Y-%m-%dT%H:%M:00Z')

jd_start = dt.datetime.strptime(start_date,'%Y-%m-%dT%H:%M:00Z')
jd_stop = dt.datetime.strptime(stop_date,'%Y-%m-%dT%H:%M:00Z')

'''
start_date = jd_start.strftime('%Y-%m-%d %H:00')
stop_date  = jd_stop.strftime('%Y-%m-%d %H:00')

jd_start = dt.datetime.strptime(start_date,'%Y-%m-%d %H:%M')
jd_stop = dt.datetime.strptime(stop_date,'%Y-%m-%d %H:%M')
'''
print start_date,'to',stop_date

sos_name = 'water_surface_height_above_reference_datum'









    



2014-12-01T18:03:00Z to 2014-12-01T18:03:00Z

Now we need to specify all the names we know for water level, names that will get used in the CSW search, and also to find data in the datasets that are returned. This is ugly and fragile. There hopefully will be a better way in the future...



In [150]:

    
std_name_list=['sea_water_salinity','foobar_salinity']

Search CSW for datasets of interest



In [151]:

    
#from IPython.core.display import HTML
#HTML('<iframe src=http://www.ngdc.noaa.gov/geoportal/ width=950 height=400></iframe>')



In [152]:

    
# connect to CSW, explore it's properties

endpoint = 'http://www.ngdc.noaa.gov/geoportal/csw' # NGDC Geoportal
#endpoint = 'http://catalog.data.gov/csw-all' #  catalog.data.gov CSW
#endpoint = 'http://catalog.data.gov/csw' #  catalog.data.gov CSW
#endpoint = 'http://www.nodc.noaa.gov/geoportal/csw'   # NODC Geoportal: granule level
#endpoint = 'http://data.nodc.noaa.gov/geoportal/csw'  # NODC Geoportal: collection level   

#endpoint = 'http://geoport.whoi.edu/geoportal/csw'  # USGS WHSC Geoportal

#endpoint = 'http://www.nodc.noaa.gov/geoportal/csw'   # NODC Geoportal: granule level
#endpoint = 'http://data.nodc.noaa.gov/geoportal/csw'  # NODC Geoportal: collection level   
#endpoint = 'http://geodiscover.cgdi.ca/wes/serviceManagerCSW/csw'  # NRCAN CUSTOM
#endpoint = 'http://geoport.whoi.edu/gi-cat/services/cswiso' # USGS Woods Hole GI_CAT
#endpoint = 'http://cida.usgs.gov/gdp/geonetwork/srv/en/csw' # USGS CIDA Geonetwork
#endpoint = 'http://cmgds.marine.usgs.gov/geonetwork/srv/en/csw' # USGS Coastal and Marine Program
#endpoint = 'http://geoport.whoi.edu/geoportal/csw' # USGS Woods Hole Geoportal 
#endpoint = 'http://geo.gov.ckan.org/csw'  # CKAN testing site for new Data.gov
#endpoint = 'https://edg.epa.gov/metadata/csw'  # EPA
#endpoint = 'http://cwic.csiss.gmu.edu/cwicv1/discovery'  # CWIC

csw = CatalogueServiceWeb(endpoint,timeout=60)
csw.version









    Out[152]:





'2.0.2'



In [153]:

    
csw.get_operation_by_name('GetRecords').constraints









    Out[153]:





[Constraint: SupportedCommonQueryables - ['Subject', 'Title', 'Abstract', 'AnyText', 'Format', 'Identifier', 'Modified', 'Type', 'BoundingBox'],
 Constraint: SupportedISOQueryables - ['apiso:Subject', 'apiso:Title', 'apiso:Abstract', 'apiso:AnyText', 'apiso:Format', 'apiso:Identifier', 'apiso:Modified', 'apiso:Type', 'apiso:BoundingBox', 'apiso:CRS.Authority', 'apiso:CRS.ID', 'apiso:CRS.Version', 'apiso:RevisionDate', 'apiso:AlternateTitle', 'apiso:CreationDate', 'apiso:PublicationDate', 'apiso:OrganizationName', 'apiso:HasSecurityConstraints', 'apiso:Language', 'apiso:ResourceIdentifier', 'apiso:ParentIdentifier', 'apiso:KeywordType', 'apiso:TopicCategory', 'apiso:ResourceLanguage', 'apiso:GeographicDescriptionCode', 'apiso:Denominator', 'apiso:DistanceValue', 'apiso:DistanceUOM', 'apiso:TempExtent_begin', 'apiso:TempExtent_end', 'apiso:ServiceType', 'apiso:ServiceTypeVersion', 'apiso:Operation', 'apiso:OperatesOn', 'apiso:OperatesOnIdentifier', 'apiso:OperatesOnName', 'apiso:CouplingType'],
 Constraint: AdditionalQueryables - ['apiso:Degree', 'apiso:AccessConstraints', 'apiso:OtherConstraints', 'apiso:Classification', 'apiso:ConditionApplyingToAccessAndUse', 'apiso:Lineage', 'apiso:ResponsiblePartyRole', 'apiso:ResponsiblePartyName', 'apiso:SpecificationTitle', 'apiso:SpecificationDate', 'apiso:SpecificationDateType']]



In [154]:

    
# hopefully something like this will be implemented in fes soon
def dateRange(start_date='1900-01-01',stop_date='2100-01-01',constraint='overlaps'):
    if constraint == 'overlaps':
        start = fes.PropertyIsLessThanOrEqualTo(propertyname='apiso:TempExtent_begin', literal=stop_date)
        stop = fes.PropertyIsGreaterThanOrEqualTo(propertyname='apiso:TempExtent_end', literal=start_date)
    elif constraint == 'within':
        start = fes.PropertyIsGreaterThanOrEqualTo(propertyname='apiso:TempExtent_begin', literal=start_date)
        stop = fes.PropertyIsLessThanOrEqualTo(propertyname='apiso:TempExtent_end', literal=stop_date)
    return start,stop



In [155]:

    
stop_date









    Out[155]:





'2014-12-01T18:03:00Z'



In [156]:

    
# convert User Input into FES filters
start,stop = dateRange(start_date,stop_date)
bbox = fes.BBox(box)



In [157]:

    
or_filt = fes.Or([fes.PropertyIsLike(propertyname='apiso:AnyText',literal=('*%s*' % val),
                    escapeChar='\\',wildCard='*',singleChar='?') for val in std_name_list])



In [158]:

    
val = 'sea_water_salinity'
any_filt = fes.PropertyIsLike(propertyname='apiso:AnyText',literal=('*%s*' % val),
                    escapeChar='\\',wildCard='*',singleChar='?')



In [159]:

    
val = 'Averages'
not_filt = fes.Not([fes.PropertyIsLike(propertyname='apiso:AnyText',literal=('*%s*' % val),
                        escapeChar='\\',wildCard='*',singleChar='?')])



In [160]:

    
filter_list = [fes.And([ bbox, start, stop, any_filt]) ]



In [161]:

    
# try request using multiple filters "and" syntax: [[filter1,filter2]]
csw.getrecords2(constraints=filter_list,maxrecords=1000,esn='full')
print len(csw.records.keys())



In [162]:

    
csw.request









    Out[162]:





u'<?xml version="1.0" encoding="utf-8" standalone="no"?>\n<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" outputFormat="application/xml" version="2.0.2" service="CSW" resultType="results" maxRecords="1000" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd"><csw:Query typeNames="csw:Record"><csw:ElementSetName>full</csw:ElementSetName><csw:Constraint version="1.1.0"><ogc:Filter><ogc:And><ogc:BBOX><ogc:PropertyName>ows:BoundingBox</ogc:PropertyName><gml:Envelope><gml:lowerCorner>39.0 -75.0</gml:lowerCorner><gml:upperCorner>41.5 -71.0</gml:upperCorner></gml:Envelope></ogc:BBOX><ogc:PropertyIsLessThanOrEqualTo><ogc:PropertyName>apiso:TempExtent_begin</ogc:PropertyName><ogc:Literal>2014-12-01T18:03:00Z</ogc:Literal></ogc:PropertyIsLessThanOrEqualTo><ogc:PropertyIsGreaterThanOrEqualTo><ogc:PropertyName>apiso:TempExtent_end</ogc:PropertyName><ogc:Literal>2014-12-01T18:03:00Z</ogc:Literal></ogc:PropertyIsGreaterThanOrEqualTo><ogc:PropertyIsLike wildCard="*" singleChar="?" escapeChar="\\"><ogc:PropertyName>apiso:AnyText</ogc:PropertyName><ogc:Literal>*sea_water_salinity*</ogc:Literal></ogc:PropertyIsLike></ogc:And></ogc:Filter></csw:Constraint></csw:Query></csw:GetRecords>'

Now print out some titles



In [127]:

    
for rec,item in csw.records.iteritems():
    print item.title









    



HYbrid Coordinate Ocean Model (HYCOM): Global
NECOFS GOM3 (FVCOM) - Northeast US - Latest Forecast
ROMS ESPRESSO Real-Time Operational IS4DVAR Forecast System Version 2 (NEW) 2013-present FMRC History (Best)
DBOFS - Delaware Bay Operational Forecast System - NOAA CO-OPS - POM
CBOFS - Chesapeake Bay Operational Forecast System - NOAA CO-OPS - POM
NECOFS Massachusetts (FVCOM) - Massachusetts Coastal - Latest Forecast

Define a function that will return the endpoint for a specified service type



In [48]:

    
def service_urls(records,service_string='urn:x-esri:specification:ServiceType:odp:url'):
    """
    extract service_urls of a specific type (DAP, SOS) from records
    """
    urls=[]
    for key,rec in records.iteritems():
        #create a generator object, and iterate through it until the match is found
        #if not found, gets the default value (here "none")
        url = next((d['url'] for d in rec.references if d['scheme'] == service_string), None)
        if url is not None:
            urls.append(url)
    return urls



In [50]:

    
choice=random.choice(list(csw.records.keys()))
print choice
csw.records[choice].references









    



7a31df5f-679c-461f-a2c2-77b3293f6e55






    Out[50]:





[{'scheme': 'None',
  'url': 'http://www.nodc.noaa.gov/cgi-bin/OAS/prd/accession/download/9200120'},
 {'scheme': 'None',
  'url': 'ftp://ftp.nodc.noaa.gov/nodc/archive/arc0001/9200120/1.1/'}]

Print out all the OPeNDAP Data URL endpoints



In [40]:

    
dap_urls = service_urls(csw.records,service_string='urn:x-esri:specification:ServiceType:odp:url')
print "\n".join(dap_urls)

Print out all the SOS Data URL endpoints



In [17]:

    
sos_urls = service_urls(csw.records,service_string='urn:x-esri:specification:ServiceType:sos:url')
print "\n".join(sos_urls)



In [18]:

    
def nearxy(x,y,xi,yi):
    """
    find the indices x[i] of arrays (x,y) closest to the points (xi,yi)
    """
    ind=ones(len(xi),dtype=int)
    dd=ones(len(xi),dtype='float')
    for i in arange(len(xi)):
        dist=sqrt((x-xi[i])**2+(y-yi[i])**2)
        ind[i]=dist.argmin()
        dd[i]=dist[ind[i]]
    return ind,dd



In [19]:

    
def find_ij(x,y,d,xi,yi):
    """
    find non-NaN cell d[j,i] that are closest to points (xi,yi).
    """
    index = where(~isnan(d.flatten()))[0]
    ind,dd = nearxy(x.flatten()[index],y.flatten()[index],xi,yi)
    j,i=ind2ij(x,index[ind])
    return i,j,dd



In [20]:

    
def find_timevar(cube):
    """
    return the time variable from Iris. This is a workaround for
    Iris having problems with FMRC aggregations, which produce two time coordinates
    """
    try:
        cube.coord(axis='T').rename('time')
    except:
        pass
    timevar = cube.coord('time')
    return timevar



In [21]:

    
def ind2ij(a,index):
    """
    returns a[j,i] for a.ravel()[index]
    """
    n,m = shape(lon)
    j = ceil(index/m).astype(int)
    i = remainder(index,m)
    return i,j

1. Get observations from SOS



In [22]:

    
collector = CoopsSos()



In [23]:

    
collector.server.identification.title









    Out[23]:





'NOAA.NOS.CO-OPS SOS'



In [24]:

    
collector.start_time = jd_start
collector.end_time = jd_stop
collector.variables = [sos_name]



In [25]:

    
ofrs = collector.server.offerings



In [26]:

    
print len(ofrs)
for p in ofrs[700:710]: print p









    



1018
Offering id: station-8447387, name: urn:ioos:station:NOAA.NOS.CO-OPS:8447387
Offering id: station-8447435, name: urn:ioos:station:NOAA.NOS.CO-OPS:8447435
Offering id: station-8447930, name: urn:ioos:station:NOAA.NOS.CO-OPS:8447930
Offering id: station-8449130, name: urn:ioos:station:NOAA.NOS.CO-OPS:8449130
Offering id: station-8452660, name: urn:ioos:station:NOAA.NOS.CO-OPS:8452660
Offering id: station-8452944, name: urn:ioos:station:NOAA.NOS.CO-OPS:8452944
Offering id: station-8452951, name: urn:ioos:station:NOAA.NOS.CO-OPS:8452951
Offering id: station-8454000, name: urn:ioos:station:NOAA.NOS.CO-OPS:8454000
Offering id: station-8454049, name: urn:ioos:station:NOAA.NOS.CO-OPS:8454049
Offering id: station-8461490, name: urn:ioos:station:NOAA.NOS.CO-OPS:8461490

Find the SOS stations within our bounding box and time extent

We would like to just use a filter on a collection to get a new collection, but PYOOS doesn't do that yet. So we do a GetObservation request for a collection, including a bounding box, and asking for one value at the start of the time period of interest. We use that to do a bounding box filter on the SOS server, which returns 1 point for each station found. So for 3 stations, we get back 3 records, in CSV format. We can strip the station ids from the CSV, and then we have a list of stations we can use with pyoos. The template for the GetObservation query for the bounding box filtered collection was generated using the GUI at http://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/



In [27]:

    
iso_start = jd_start.strftime('%Y-%m-%dT%H:%M:%SZ')
print iso_start
box_str=','.join(str(e) for e in box)
print box_str









    



2014-03-21T15:00:00Z
-76.4751,38.389,-71.7432,42.9397



In [28]:

    
url=('http://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/SOS?'
     'service=SOS&request=GetObservation&version=1.0.0&'
     'observedProperty=%s&offering=urn:ioos:network:NOAA.NOS.CO-OPS:WaterLevelActive&'
     'featureOfInterest=BBOX:%s&responseFormat=text/csv&eventTime=%s&'
     'result=VerticalDatum==urn:ogc:def:datum:epsg::5103&'
     'dataType=PreliminarySixMinute') % (sos_name,box_str,iso_start)
print url
obs_loc_df = pd.read_csv(url)









    



http://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/SOS?service=SOS&request=GetObservation&version=1.0.0&observedProperty=water_surface_height_above_reference_datum&offering=urn:ioos:network:NOAA.NOS.CO-OPS:WaterLevelActive&featureOfInterest=BBOX:-76.4751,38.389,-71.7432,42.9397&responseFormat=text/csv&eventTime=2014-03-21T15:00:00Z&result=VerticalDatum==urn:ogc:def:datum:epsg::5103&dataType=PreliminarySixMinute



In [29]:

    
obs_loc_df.head()









    Out[29]:






  
    
      
      station_id
      sensor_id
      latitude (degree)
      longitude (degree)
      date_time
      water_surface_height_above_reference_datum (m)
      datum_id
      vertical_position (m)
    
  
  
    
      0
       urn:ioos:station:NOAA.NOS.CO-OPS:8461490
       urn:ioos:sensor:NOAA.NOS.CO-OPS:8461490:B1
       41.3614
      -72.0900
       2014-03-21T15:00:00Z
      -0.139
       urn:ogc:def:datum:epsg::5103
       1.634
    
    
      1
       urn:ioos:station:NOAA.NOS.CO-OPS:8467150
       urn:ioos:sensor:NOAA.NOS.CO-OPS:8467150:B1
       41.1733
      -73.1817
       2014-03-21T15:00:00Z
      -0.833
       urn:ogc:def:datum:epsg::5103
       1.775
    
    
      2
       urn:ioos:station:NOAA.NOS.CO-OPS:8510560
       urn:ioos:sensor:NOAA.NOS.CO-OPS:8510560:A1
       41.0483
      -71.9600
       2014-03-21T15:00:00Z
      -0.097
       urn:ogc:def:datum:epsg::5103
       1.655
    
    
      3
       urn:ioos:station:NOAA.NOS.CO-OPS:8518750
       urn:ioos:sensor:NOAA.NOS.CO-OPS:8518750:B1
       40.7006
      -74.0142
       2014-03-21T15:00:00Z
       0.282
       urn:ogc:def:datum:epsg::5103
       1.848
    
    
      4
       urn:ioos:station:NOAA.NOS.CO-OPS:8531680
       urn:ioos:sensor:NOAA.NOS.CO-OPS:8531680:B1
       40.4669
      -74.0094
       2014-03-21T15:00:00Z
       0.415
       urn:ogc:def:datum:epsg::5103
       1.624
    
  

5 rows × 8 columns



In [30]:

    
stations = [sta.split(':')[-1] for sta in obs_loc_df['station_id']]
print stations
obs_lon = [sta for sta in obs_loc_df['longitude (degree)']]
obs_lat = [sta for sta in obs_loc_df['latitude (degree)']]
print obs_lon
print obs_lat









    



['8461490', '8467150', '8510560', '8518750', '8531680', '8534720', '8536110', '8545240', '8551910', '8557380', '8571892']
[-72.090000000000003, -73.181700000000006, -71.959999999999994, -74.014200000000002, -74.009399999999999, -74.418300000000002, -74.959999999999994, -75.1417, -75.573300000000003, -75.120000000000005, -76.068299999999994]
[41.361400000000003, 41.173299999999998, 41.048299999999998, 40.700600000000001, 40.466900000000003, 39.354999999999997, 38.968299999999999, 39.933300000000003, 39.558300000000003, 38.781700000000001, 38.573300000000003]

Get longName from SOS DescribeSensor request



In [31]:

    
def get_Coops_longName(sta):
    """
    get longName for specific station from COOPS SOS using DescribeSensor request
    """
    url=('http://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/SOS?service=SOS&'
        'request=DescribeSensor&version=1.0.0&outputFormat=text/xml;subtype="sensorML/1.0.1"&'
        'procedure=urn:ioos:sensor:NOAA.NOS.CO-OPS:%s:B1') % sta
    tree = etree.parse(urllib2.urlopen(url))
    root = tree.getroot()
    longName=root.xpath("//sml:identifier[@name='longName']/sml:Term/sml:value/text()", namespaces={'sml':"http://www.opengis.net/sensorML/1.0.1"})
    return longName

Request CSV response from SOS and convert to Pandas DataFrames



In [32]:

    
def coops2df(collector,coops_id,sos_name):
    collector.features = [coops_id]
    collector.variables = [sos_name]
    response = collector.raw(responseFormat="text/csv")
    data_df = pd.read_csv(cStringIO.StringIO(str(response)), parse_dates=True, index_col='date_time')
    data_df['Observed Data']=data_df['water_surface_height_above_reference_datum (m)']-data_df['vertical_position (m)']
    a = get_Coops_longName(coops_id)
    if len(a)==0:
        long_name=coops_id
    else:
        long_name=a[0]
        
    data_df.name=long_name
    return data_df

Generate a uniform 6-min time base for model/data comparison:



In [33]:

    
ts_rng = pd.date_range(start=jd_start, end=jd_stop, freq='6Min')
ts = pd.DataFrame(index=ts_rng)
print jd_start,jd_stop
print len(ts)









    



2014-03-21 15:00:00 2014-03-27 15:00:00
1441

Create a list of obs dataframes, one for each station:



In [34]:

    
obs_df=[]
for sta in stations:
    b=coops2df(collector,sta,sos_name)
    print b.name
    # limit interpolation to 10 points (10 @ 6min = 1 hour)
    obs_df.append(pd.DataFrame(pd.concat([b, ts],axis=1).interpolate(limit=10)['Observed Data']))
    obs_df[-1].name=b.name









    



New London, CT
Bridgeport, CT
8510560
The Battery, NY
Sandy Hook, NJ
8534720
8536110
Philadelphia, PA
Reedy Point, DE
Lewes, DE
Cambridge, MD



In [35]:

    
sta=0;obs_df[sta].plot(figsize=(12,5),title=obs_df[sta].name);

Get model output from OPeNDAP URLS

Try to open all the OPeNDAP URLS using Iris from the British Met Office. If we can open in Iris, we know it's a model result.

Construct an Iris contraint to load only cubes that match the std_name_list:



In [36]:

    
print std_name_list
def name_in_list(cube):
    return cube.standard_name in std_name_list
constraint = iris.Constraint(cube_func=name_in_list)









    



['water_surface_height_above_reference_datum', 'sea_surface_height_above_geoid', 'sea_surface_elevation', 'sea_surface_height_above_reference_ellipsoid', 'sea_surface_height_above_sea_level', 'sea_surface_height', 'water level']



In [37]:

    
def mod_df(arr,timevar,istart,istop,mod_name,ts):
    """
    return time series (DataFrame) from model interpolated onto uniform time base
    """
    t=timevar.points[istart:istop]
    jd = timevar.units.num2date(t)

    # eliminate any data that is closer together than 10 seconds
    # this was required to handle issues with CO-OPS aggregations, I think because
    # they use floating point time in hours, which is not very accurate, so the FMRC
    # aggregation is aggregating points that actually occur at the same time
    dt =diff(jd)
    s = array([ele.seconds for ele in dt])
    ind=where(s>10)[0]
    arr=arr[ind+1]
    jd=jd[ind+1]
    
    b = pd.DataFrame(arr,index=jd,columns=[mod_name])
    # eliminate any data with NaN
    b = b[isfinite(b[mod_name])]
    # interpolate onto uniform time base, fill gaps up to: (10 values @ 6 min = 1 hour)
    c = pd.concat([b, ts],axis=1).interpolate(limit=10)
    return c



In [38]:

    
for url in dap_urls:
    try:
        a = iris.load_cube(url,constraint)
        # convert to units of meters
        a.convert_units('m')
        # take first 20 chars for model name
        mod_name = a.attributes['title'][0:20]
        r = shape(a)
        timevar = find_timevar(a)
        lat = a.coord(axis='Y').points
        lon = a.coord(axis='X').points
        jd = timevar.units.num2date(timevar.points)
        istart = timevar.nearest_neighbour_index(timevar.units.date2num(jd_start))
        istop = timevar.nearest_neighbour_index(timevar.units.date2num(jd_stop))
        
        # only proceed if we have data in the range requested
        if istart != istop:
            print url
            nsta = len(obs_lon)
            if len(r)==3:
                d = a[0,:,:].data
                # find the closest non-land point from a structured grid model
                j,i,dd = find_ij(lon,lat,d,obs_lon,obs_lat)
                for n in range(nsta):
                    # only use if model cell is within 0.1 degree of requested location
                    if dd[n]<0.1:
                        arr = a[istart:istop,j[n],i[n]].data                    
                        c = mod_df(arr,timevar,istart,istop,mod_name,ts)
                        name= obs_df[n].name
                        obs_df[n]=pd.concat([obs_df[n],c],axis=1)
                        obs_df[n].name = name
            elif len(r)==2:
                # find the closest point from an unstructured grid model
                index,dd = nearxy(lon.flatten(),lat.flatten(),obs_lon,obs_lat)
                for n in range(nsta):
                    # only use if model cell is within 0.1 degree of requested location
                    if dd[n]<0.1:
                        arr = a[istart:istop,index[n]].data
                        c = mod_df(arr,timevar,istart,istop,mod_name,ts)
                        name = obs_df[n].name
                        obs_df[n]=pd.concat([obs_df[n],c],axis=1)
                        obs_df[n].name = name                 
    except:
        pass



In [39]:

    
for df in obs_df:
    df.plot(figsize=(14,6),title=df.name)
    ylabel('m')



In [39]:

	station_id	sensor_id	latitude (degree)	longitude (degree)	date_time	water_surface_height_above_reference_datum (m)	datum_id	vertical_position (m)
0	urn:ioos:station:NOAA.NOS.CO-OPS:8461490	urn:ioos:sensor:NOAA.NOS.CO-OPS:8461490:B1	41.3614	-72.0900	2014-03-21T15:00:00Z	-0.139	urn:ogc:def:datum:epsg::5103	1.634
1	urn:ioos:station:NOAA.NOS.CO-OPS:8467150	urn:ioos:sensor:NOAA.NOS.CO-OPS:8467150:B1	41.1733	-73.1817	2014-03-21T15:00:00Z	-0.833	urn:ogc:def:datum:epsg::5103	1.775
2	urn:ioos:station:NOAA.NOS.CO-OPS:8510560	urn:ioos:sensor:NOAA.NOS.CO-OPS:8510560:A1	41.0483	-71.9600	2014-03-21T15:00:00Z	-0.097	urn:ogc:def:datum:epsg::5103	1.655
3	urn:ioos:station:NOAA.NOS.CO-OPS:8518750	urn:ioos:sensor:NOAA.NOS.CO-OPS:8518750:B1	40.7006	-74.0142	2014-03-21T15:00:00Z	0.282	urn:ogc:def:datum:epsg::5103	1.848
4	urn:ioos:station:NOAA.NOS.CO-OPS:8531680	urn:ioos:sensor:NOAA.NOS.CO-OPS:8531680:B1	40.4669	-74.0094	2014-03-21T15:00:00Z	0.415	urn:ogc:def:datum:epsg::5103	1.624