In [1]:
"""
The original notebook is NGDC_CSW_QueryForIOOSRAs.ipynb
Created by Emilio Mayorga, 2/10/2014
"""
title = 'Catalog based search for the IOOS Regional Associations acronyms'
name = '2015-11-23-NGDC_CSW_QueryForIOOSRAs'
In [2]:
%matplotlib inline
import seaborn
seaborn.set(style='ticks')
import os
from datetime import datetime
from IPython.core.display import HTML
import warnings
warnings.simplefilter("ignore")
# Metadata and markdown generation.
hour = datetime.utcnow().strftime('%H:%M')
comments = "true"
date = '-'.join(name.split('-')[:3])
slug = '-'.join(name.split('-')[3:])
metadata = dict(title=title,
                date=date,
                hour=hour,
                comments=comments,
                slug=slug,
                name=name)
markdown = """Title: {title}
date: {date} {hour}
comments: {comments}
slug: {slug}
{{% notebook {name}.ipynb cells[2:] %}}
""".format(**metadata)
content = os.path.abspath(os.path.join(os.getcwd(), os.pardir,
                                       os.pardir, '{}.md'.format(name)))
with open(content, 'w') as f:
    f.write(markdown)
html = """
<small>
<p> This post was written as an IPython notebook. It is available for
<a href="http://ioos.github.com/system-test/downloads/notebooks/%s.ipynb">download</a>.
You can also try an interactive version on
<a href="http://mybinder.org/repo/ioos/system-test/">binder</a>.</p>
<p></p>
</small>
""" % (name)
The goal of this post is to investigate whether it is possible to query the NGDC CSW Catalog to extract records matching an IOOS RA acronym, such as SECOORA.
In the cell below we do the usual: instantiate a Catalogue Service for the Web (csw) using the NGDC catalog endpoint.
In [3]:
from owslib.csw import CatalogueServiceWeb
endpoint = 'http://www.ngdc.noaa.gov/geoportal/csw'
csw = CatalogueServiceWeb(endpoint, timeout=30)
We need a list of all the Regional Associations we know.
In [4]:
ioos_ras = ['AOOS',      # Alaska
            'CaRA',      # Caribbean
            'CeNCOOS',   # Central and Northern California
            'GCOOS',     # Gulf of Mexico
            'GLOS',      # Great Lakes
            'MARACOOS',  # Mid-Atlantic
            'NANOOS',    # Pacific Northwest
            'NERACOOS',  # Northeast Atlantic
            'PacIOOS',   # Pacific Islands
            'SCCOOS',    # Southern California
            'SECOORA']   # Southeast Atlantic
To streamline the query we can create a function that instantiates the fes filter and returns the records.
In [5]:
from owslib.fes import PropertyIsEqualTo
def query_ra(csw, ra='SECOORA'):
    # Match records whose keywords equal the RA acronym exactly.
    q = PropertyIsEqualTo(propertyname='apiso:Keywords', literal=ra)
    csw.getrecords2(constraints=[q], maxrecords=100, esn='full')
    return csw
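The function above asks for at most 100 records in a single request. If an RA ever has more matches than that, the request could be paged. Below is a minimal sketch of that idea, not part of the original notebook. The name `iter_records` and the `limit` safety cap are hypothetical. It only relies on the `startposition` and `maxrecords` arguments of `getrecords2` and on the `matches` and `nextrecord` entries of `csw.results`, and it assumes the server reports `nextrecord` as 0 once the last page has been returned.

```python
def iter_records(csw, constraints, page_size=100, limit=1000):
    """Yield (identifier, record) pairs, requesting one page at a time.

    `csw` is expected to behave like owslib's CatalogueServiceWeb.
    `limit` is a hypothetical safety cap on the total number of records.
    """
    start = 0
    fetched = 0
    while fetched < limit:
        csw.getrecords2(constraints=constraints, startposition=start,
                        maxrecords=page_size, esn='full')
        # Copy the page so a later request cannot change what we iterate.
        page = list(csw.records.items())
        if not page:
            break
        for identifier, record in page:
            yield identifier, record
            fetched += 1
            if fetched >= limit:
                return
        next_record = csw.results.get('nextrecord', 0)
        # Assumption: 0 (or a position past the last match) means "done".
        if next_record == 0 or next_record > csw.results.get('matches', 0):
            break
        start = next_record
```

With a live catalog this could be used as `dict(iter_records(csw, [q]))`, where `q` is the same `PropertyIsEqualTo` filter built in `query_ra`.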
In [6]:
for ra in ioos_ras:
    csw = query_ra(csw, ra)
    ret = csw.results['returned']
    # Use the singular only for exactly one record.
    word = 'record' if ret == 1 else 'records'
    print("{0:>8} has {1:>3} {2}".format(ra, ret, word))
    csw.records.clear()
I would not trust those numbers completely. Surely some of the RAs listed above have more than zero or one record.
Note that we have more information in csw.records.
Let's inspect one of SECOORA's stations for example.
In [7]:
csw = query_ra(csw, 'SECOORA')
key = list(csw.records.keys())[0]  # keys() is not indexable on Python 3
print(key)
We can verify the station type, title, and last date of modification.
In [8]:
station = csw.records[key]
station.type, station.title, station.modified
Out[8]:
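When a query returns several records, the same three attributes can be collected for all of them at once. The helper below is a small sketch of that; `summarize_records` is a hypothetical name, and the `demo` dictionary holds made-up stand-ins so the block runs without a catalog connection. With real results you would pass `csw.records` instead.

```python
from types import SimpleNamespace


def summarize_records(records):
    """Return (identifier, type, title, modified) rows sorted by identifier.

    Missing or empty attributes are reported as empty strings.
    """
    rows = []
    for identifier, rec in records.items():
        rows.append((identifier,
                     getattr(rec, 'type', None) or '',
                     getattr(rec, 'title', None) or '',
                     getattr(rec, 'modified', None) or ''))
    return sorted(rows, key=lambda row: row[0])


# Made-up stand-ins for catalog records (not real stations).
demo = {
    'station-b': SimpleNamespace(type='dataset', title='Hypothetical B',
                                 modified='2015-01-02'),
    'station-a': SimpleNamespace(type='dataset', title='Hypothetical A',
                                 modified=None),
}

for row in summarize_records(demo):
    print("{0:<12} {1:<10} {2:<16} {3}".format(*row))
```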
The subjects field contains the variables and some useful keywords.
In [9]:
station.subjects
Out[9]:
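Since the query matches keywords exactly, it can help to see how the acronym is actually written among a record's subjects. The sketch below is one way to check that locally. `matching_subjects` is a hypothetical helper, and the sample list is made up for illustration. It assumes the subjects are a list of strings that may contain empty entries.

```python
def matching_subjects(subjects, ra):
    """Return the subjects that contain the acronym, ignoring case."""
    needle = ra.lower()
    # Skip None or empty entries, which can show up in harvested metadata.
    return [s for s in subjects if s and needle in s.lower()]


# Made-up subjects list, for illustration only.
sample_subjects = ['sea_water_temperature', None, 'SECOORA',
                   'secoora > buoys', 'salinity']
print(matching_subjects(sample_subjects, 'SECOORA'))
```

With a real record this would be called as `matching_subjects(station.subjects, 'SECOORA')`.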
And we can access the full XML description for the station.
In [10]:
print(station.xml)
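Printing the XML is fine for a quick look, but individual fields can also be pulled out with the standard library. The sketch below assumes the record is a csw:Record document with Dublin Core elements, which I believe is what the default output schema of `getrecords2` returns. `dc_values` is a hypothetical helper and `sample` is a hand-written stand-in, not a real catalog record.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used by the elements inside a csw:Record.
DC = 'http://purl.org/dc/elements/1.1/'


def dc_values(xml_bytes, element):
    """Return the text of every dc:<element> found in the document."""
    root = ET.fromstring(xml_bytes)
    tag = '{%s}%s' % (DC, element)
    return [el.text for el in root.iter(tag) if el.text]


# Hand-written stand-in for station.xml (not a real record).
sample = b"""<csw:Record xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>example-station-id</dc:identifier>
  <dc:title>Example station</dc:title>
  <dc:subject>sea_water_temperature</dc:subject>
  <dc:subject>SECOORA</dc:subject>
</csw:Record>"""

print(dc_values(sample, 'title'))
print(dc_values(sample, 'subject'))
```

With a live result, `dc_values(station.xml, 'subject')` should give the same keywords as `station.subjects`, provided the assumption about the schema holds.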
This query is very simple, but also very powerful. With just a few lines of code we can quickly assess the data available for a given Regional Association.
You can see the original notebook here.
In [11]:
HTML(html)
Out[11]: