In [1]:
from utilities import css_styles
css_styles()


Out[1]:

IOOS System Test - Theme 1 - Scenario E - Description/Discussion

Exploring Salinity Data in Sensors and Satellite Data

Questions

  1. Can we discover, access, and overlay salinity information from sensors?
  2. Can we discover, access, and overlay salinity information from models?
  3. Are data from different sensors and satellites (or models) directly comparable? Same units? Same scales?
  4. If not, how much work is necessary to aggregate these streams?
  5. Is the metadata for these data intelligible?

Q1 - Can we discover, access, and overlay salinity information?


In [2]:
from pylab import *
from owslib.csw import CatalogueServiceWeb
from owslib import fes
import random
import netCDF4
import pandas as pd
import datetime as dt
from pyoos.collectors.coops.coops_sos import CoopsSos
import cStringIO
import iris
import urllib2
import parser
from lxml import etree

import numpy as np

#generated for csw interface
#from fes_date_filter_formatter import fes_date_filter  #date formatter (R.Signell)
import requests              #required for the processing of requests
from utilities import * 

from IPython.display import HTML
import folium #required for leaflet mapping
import calendar #used to get number of days in a month and year

Define space and time constraints

We focus on Kachemak Bay because it narrows the result set, and because temperature and salinity are key variables in defining thresholds for harmful algal blooms (HABs) that lead to Paralytic Shellfish Poisoning (PSP) warnings.


In [3]:
#bounding box of interest, [[lower left lon,lat], [upper right lon,lat]]
bounding_box_type = "box"
bounding_box = [[-152.0,59.25],[-150.6,60.00]]
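
As a quick sanity check, we can draw the box on a Leaflet map with folium (imported above for Leaflet mapping). This is only a sketch: it assumes a folium release that provides Map, PolyLine, and add_to (older versions exposed a map.line() method instead), and that the notebook renders the map object inline.


In [ ]:
#sketch: draw the bounding box on a Leaflet map (folium API assumptions noted above)
lon_min, lat_min = bounding_box[0]
lon_max, lat_max = bounding_box[1]
bbox_map = folium.Map(location=[(lat_min + lat_max) / 2.0, (lon_min + lon_max) / 2.0],
                      zoom_start=8)
folium.PolyLine([(lat_min, lon_min), (lat_min, lon_max),
                 (lat_max, lon_max), (lat_max, lon_min),
                 (lat_min, lon_min)],
                color='#FF0000', weight=3).add_to(bbox_map)
bbox_map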

Define the temporal range of interest; here, just this year (2014) to date.


In [4]:
#temporal range
#I'm just interested in this year
start_date = dt.datetime(2014,1,1).strftime('%Y-%m-%d %H:00')
end_date = dt.datetime(2014,7,13).strftime('%Y-%m-%d %H:00')
time_date_range = [start_date,end_date]  #start_date_end_date

print bounding_box
print start_date,'to',end_date


[[-152.0, 59.25], [-150.6, 60.0]]
2014-01-01 00:00 to 2014-07-13 00:00

Define the web-service endpoint to check: the NGDC Geoportal CSW.


In [5]:
endpoint = 'http://www.ngdc.noaa.gov/geoportal/csw' # NGDC Geoportal

csw = CatalogueServiceWeb(endpoint,timeout=60)

# verify that the endpoint supports GetRecords; uncomment the print to
# inspect the ISO queryables it exposes
for oper in csw.operations:
    if oper.name == 'GetRecords':
        #print '\nISO Queryables:\n',oper.constraints['SupportedISOQueryables']['values']
        pass

Define the possible variables we're looking for, using CF standard names.


In [6]:
#put the names in a dict for ease of access
#I'm skipping cox, knudsen, preformed, and reference salinity here
data_dict = {}
data_dict["salinity"] = {"names": ['salinity',
                                   'sea_surface_salinity',
                                   'sea_water_absolute_salinity',
                                   'sea_water_practical_salinity',
                                   'sea_water_salinity'],
                         "sos_name": ["salinity"]}

Set up OWSLib and its FES filter capabilities. This puts our bounding box and data_dict into a form that OWSLib can use to query our OGC web-service endpoints.


In [7]:
def fes_date_filter(start_date='1900-01-01', stop_date='2100-01-01', constraint='overlaps'):
    """
    Build a pair of FES filters on the CSW temporal extent.
    'overlaps' matches records whose time range intersects [start_date, stop_date];
    'within' matches records entirely contained within that range.
    """
    if constraint == 'overlaps':
        start = fes.PropertyIsLessThanOrEqualTo(propertyname='apiso:TempExtent_begin', literal=stop_date)
        stop = fes.PropertyIsGreaterThanOrEqualTo(propertyname='apiso:TempExtent_end', literal=start_date)
    elif constraint == 'within':
        start = fes.PropertyIsGreaterThanOrEqualTo(propertyname='apiso:TempExtent_begin', literal=start_date)
        stop = fes.PropertyIsLessThanOrEqualTo(propertyname='apiso:TempExtent_end', literal=stop_date)
    return start, stop

In [8]:
# convert user input into FES filters
start, stop = fes_date_filter(start_date, end_date)
box = [bounding_box[0][0], bounding_box[0][1],
       bounding_box[1][0], bounding_box[1][1]]
bbox = fes.BBox(box)

#use the search name to create search filter
or_filt = fes.Or([fes.PropertyIsLike(propertyname='apiso:AnyText',literal=('*%s*' % val),
                    escapeChar='\\',wildCard='*',singleChar='?') for val in data_dict["salinity"]["names"]])
#filter out records containing 'Averages' (not sure if this is needed)
val = 'Averages'
not_filt = fes.Not([fes.PropertyIsLike(propertyname='apiso:AnyText',literal=('*%s*' % val),
                        escapeChar='\\',wildCard='*',singleChar='?')])

In [9]:
filter_list = [fes.And([bbox, start, stop, or_filt, not_filt]) ]
# connect to CSW and explore its properties
# try the request using multiple filters in "and" syntax: [[filter1,filter2]]
csw.getrecords2(constraints=filter_list,maxrecords=1000,esn='full')
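
getrecords2 populates csw.records with the matching records and also fills in a csw.results summary; as a quick sketch, we can check the match count before printing everything:


In [ ]:
#how many records matched the filters? (csw.results is populated by getrecords2)
print csw.results['matches'], "matched,", csw.results['returned'], "returned"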

In [10]:
def service_urls(records, service_string='urn:x-esri:specification:ServiceType:odp:url'):
    """
    Extract service URLs of a specific type (DAP, SOS) from CSW records.
    """
    urls = []
    for rec in records.itervalues():
        # iterate through the record's references until a matching scheme is
        # found; if there is no match, default to None
        url = next((d['url'] for d in rec.references if d['scheme'] == service_string), None)
        if url is not None:
            urls.append(url)
    return urls
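
For example, this helper can pull out the OPeNDAP and SOS endpoints from the result set. The odp URN is the function's default; the sos URN below is assumed by analogy and may need adjusting for a given catalog.


In [ ]:
#example usage: extract OPeNDAP and SOS endpoints from the records
#(the sos URN is an assumption; verify it against the catalog's reference schemes)
dap_urls = service_urls(csw.records)
sos_urls = service_urls(csw.records, service_string='urn:x-esri:specification:ServiceType:sos:url')
print "DAP endpoints:", dap_urls
print "SOS endpoints:", sos_urls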

What's in the result set?


In [11]:
#print records that are available
print endpoint
print "number of datasets available: ",len(csw.records.keys())
csw.records.keys()


http://www.ngdc.noaa.gov/geoportal/csw
number of datasets available:  4
Out[11]:
['National Data Buoy Center SOS',
 'Aquarius_V3_SSS_Daily',
 'Aquarius_V3_SSS_Monthly',
 'Aquarius_V3_SSS_Weekly']
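
So discovery works: the result set contains an SOS collection (NDBC) plus the Aquarius satellite salinity products. As a first step toward access, here is a minimal sketch that uses the CoopsSos collector imported above to request salinity observations as CSV and load them into pandas. The attribute names and the raw(responseFormat=...) call follow the usual pyoos collector interface; the station ID is only an example and should be verified against stations inside the bounding box.


In [ ]:
#minimal access sketch (assumptions noted above): pull salinity observations
#from the CO-OPS SOS with pyoos and load them into a pandas DataFrame
collector = CoopsSos()
collector.start_time = dt.datetime(2014, 1, 1)
collector.end_time = dt.datetime(2014, 7, 13)
collector.variables = data_dict["salinity"]["sos_name"]
collector.features = ['9455500']  #example station ID (Seldovia, AK); verify before use
response = collector.raw(responseFormat="text/csv")
obs_df = pd.read_csv(cStringIO.StringIO(str(response)), parse_dates=True)
print obs_df.head()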
