Exploring CSW access in Python using OWSLib with NODC Geoportal


In [ ]:
from IPython.core.display import HTML
HTML('<iframe src=http://www.nodc.noaa.gov/geoportal/ width=900 height=280></iframe>')

In [1]:
from owslib.csw import CatalogueServiceWeb

In [2]:
# connect to the CSW endpoint and explore its properties
endpoint = 'http://www.ngdc.noaa.gov/geoportal/csw' # NGDC Geoportal
#endpoint = 'http://data.nodc.noaa.gov/geoportal/csw'  # NODC Geoportal: collection level
    
#endpoint = 'http://geodiscover.cgdi.ca/wes/serviceManagerCSW/csw'  # NRCAN CUSTOM
#endpoint = 'http://geoport.whoi.edu/gi-cat/services/cswiso' # USGS Woods Hole GI_CAT
#endpoint = 'http://cida.usgs.gov/gdp/geonetwork/srv/en/csw' # USGS CIDA Geonetwork
#endpoint = 'http://www.nodc.noaa.gov/geoportal/csw'   # NODC Geoportal: granule level

csw = CatalogueServiceWeb(endpoint,timeout=30)
csw.version


Out[2]:
'2.0.2'

In [3]:
[op.name for op in csw.operations]


Out[3]:
['GetCapabilities',
 'DescribeRecord',
 'GetRecords',
 'GetRecordById',
 'Transaction']

In [4]:
#bbox=[-141,42,-52,84]
bbox=[-71.5, 39.5, -63.0, 46]
csw.getrecords(keywords=['temperature'],bbox=bbox,maxrecords=20)
#csw.getrecords(keywords=['sea_water_temperature'],maxrecords=20)
csw.results


/home/usgs/miniconda/envs/ioos/lib/python2.7/site-packages/owslib/csw.py:210: UserWarning: Please use the updated 'getrecords2' method instead of 'getrecords'.  
        The 'getrecords' method will be upgraded to use the 'getrecords2' parameters
        in a future version of OWSLib.
  in a future version of OWSLib.""")
Out[4]:
{'matches': 377, 'nextrecord': 21, 'returned': 20}
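
The result summary above reports 377 matches but only 20 returned records, with `nextrecord` pointing at 21, so the remaining records have to be fetched page by page. As a rough sketch (not part of the original notebook), the start positions for each page can be computed as below. The helper name `page_starts` is hypothetical, and passing each value as the `startposition` argument of the OWSLib request is an assumption to check against the installed OWSLib version.

```python
def page_starts(matches, page_size, first=1):
    """Return the 1-based start position of every page needed to cover `matches` records."""
    return list(range(first, matches + 1, page_size))

# Values taken from the csw.results dictionary shown above.
starts = page_starts(matches=377, page_size=20)

# The second page starts at 21, which agrees with the reported 'nextrecord'.
print(starts[:3])
print(len(starts))
```

Each start position would then drive one request with `maxrecords=20`, collecting `csw.records` after every call.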

In [6]:
for rec, item in csw.records.items():
    print(item.title)


CRUTEM4 Air Temperature Dataset
HADCRUT3 Combined Air Temperature/SST Anomaly
HADCRUT3 Combined Air Temperature/SST Anomaly
CRUTEM3 Air Temperature Anomaly
Global Sea Surface Temperature Analysis
CRUTEM4 Air Temperature Dataset
CRUTEM3 Air Temperature Anomaly
CRUTEM4 Air Temperature Dataset
CRUTEM4 Air Temperature Dataset
CRUTEM3 Air Temperature Anomaly
CRUTEM3 Air Temperature Anomaly
Analysed foundation sea surface temperature, global
Monthly version of HadISST sea surface temperature component
HADCRUT3 Combined Air Temperature/SST Anomaly
CRUTEM3 Air Temperature Anomaly
CRUTEM4 Air Temperature Dataset
CRUTEM4 Air Temperature Dataset
CRUTEM4 Air Temperature Dataset
Analysed foundation sea surface temperature, global
CRUTEM3 Air Temperature Anomaly

In [7]:
print(csw.records.keys())


['PSDgriddedData/cru/crutem4/var/air.mon.anom.nc', 'PSDgriddedData/cru/hadcrut3/std/air.mon.anom.biased2.5.nc', 'PSDgriddedData/cru/hadcrut3/std/air.mon.anom.mserror.nc', 'PSDgriddedData/cru/crutem3/std/air.mon.anom.samplingerror.nc', 'ghrsst.cfg.aggregation.fullAgg.aggregate__ghrsst_L4_GLOB_EUR_ODYSSEA.ncml', 'PSDgriddedData/cru/crutem4/std/air.mon.anom.nc', 'PSDgriddedData/cru/crutem3/std/air.mon.anom.nc', 'PSDgriddedData/cru/crutem4/std/air.mon.anom.stationerror.nc', 'PSDgriddedData/cru/crutem4/std/air.mon.anom.biased97.5.nc', 'PSDgriddedData/cru/crutem3/std/air.mon.anom.biased2.5.nc', 'PSDgriddedData/cru/crutem3/std/air.mon.anom.stationerror.nc', 'satellite.G1.ssta.1day', 'HadleyCenter.HadISST', 'PSDgriddedData/cru/hadcrut3/std/air.mon.anom.biased97.5.nc', 'PSDgriddedData/cru/crutem3/std/air.mon.anom.biased97.5.nc', 'PSDgriddedData/cru/crutem4/std/air.mon.anom.biased2.5.nc', 'PSDgriddedData/cru/crutem4/std/air.mon.anom.samplingerror.nc', 'PSDgriddedData/cru/crutem4/std/air.mon.anom.nobs.nc', 'satellite.GR.ssta.1day', 'PSDgriddedData/cru/crutem3/std/air.mon.anom.nobs.nc']

In [8]:
# choose a sample record
a=csw.records['satellite.G1.ssta.1day']

In [9]:
print(a.title)


Analysed foundation sea surface temperature, global

In [10]:
# unfortunately, the "uris" property is empty
print(a.uris)


[]

In [11]:
# yet the URIs are visible in the raw record XML:
print(a.xml)


<csw:SummaryRecord xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcmiBox="http://dublincore.org/documents/2000/07/11/dcmi-box/" xmlns:dct="http://purl.org/dc/terms/" xmlns:gml="http://www.opengis.net/gml" xmlns:ows="http://www.opengis.net/ows" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<dc:identifier scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:FileID">satellite.G1.ssta.1day</dc:identifier>
<dc:identifier scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:DocID">{152D8C15-039C-4520-A7E6-CC39E5D6E8DB}</dc:identifier>
<dc:title>Analysed foundation sea surface temperature, global</dc:title>
<dc:type scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:ContentType">downloadableData</dc:type>
<dc:type scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:ContentType">liveData</dc:type>
<dc:subject>NOAA CoastWatch, West Coast Node</dc:subject>
<dc:subject>sea_surface_temperature</dc:subject>
<dc:subject>latitude</dc:subject>
<dc:subject>longitude</dc:subject>
<dc:subject>time</dc:subject>
<dc:subject>climatologyMeteorologyAtmosphere</dc:subject>
<dct:modified>2013-04-07T05:42:48+00:00</dct:modified>
<dct:abstract>The through-cloud capabilities of microwave radiometers provide a valuable picture of global sea surface temperature (SST). To utilize this, scientists at Remote Sensing Systems have calculated a daily, Optimally Interpolated (OI) SST product at quarter degree (~25 kilometer) resolution. This product is ideal for research activities in which a complete, daily SST map is more desirable than one with missing data due to orbital gaps or environmental conditions precluding SST retrieval. Improved global daily NRT SSTs should be useful for a wide range of scientific and operational activities. The addition of SST derived from Ifrared (IR) measurements allows higher spatial resolution, and SST near land.However, IR input is less accurate than MW due to cloud contamination. Blending MW and IR enables greater coverage and higher accuracy than IR only SSTs, but current OI does not completely eliminate cloud contamination inherent to IR SSTs</dct:abstract>
<dct:references scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:Document">http://www.nodc.noaa.gov/geoportal/csw?getxml=%7B152D8C15-039C-4520-A7E6-CC39E5D6E8DB%7D</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:OPeNDAP">http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:WCS">http://oceanwatch.pfeg.noaa.gov/thredds/wcs/satellite/G1/ssta/1day?service=WCS&amp;version=1.0.0&amp;request=GetCapabilities</dct:references>
<dct:references scheme="urn:x-esri:specification:ServiceType:WCT">http://www.ncdc.noaa.gov/oa/wct/wct-jnlp-beta.php?singlefile=http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day</dct:references>
<ows:WGS84BoundingBox>
<ows:LowerCorner>-179.9560546875 -89.9560546875</ows:LowerCorner>
<ows:UpperCorner>179.9560546875 89.9560546875</ows:UpperCorner>
</ows:WGS84BoundingBox>
<ows:BoundingBox>
<ows:LowerCorner>-179.9560546875 -89.9560546875</ows:LowerCorner>
<ows:UpperCorner>179.9560546875 89.9560546875</ows:UpperCorner>
</ows:BoundingBox>
</csw:SummaryRecord>
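
The service URLs in this record live in the `dct:references` elements, each tagged with a `scheme` attribute. As an illustration only, the sketch below pulls them out of the raw XML with the standard library. The function name `references_from_xml` is hypothetical, and the sample string is a trimmed copy of the record above rather than a live response.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the SummaryRecord printed above.
NS = {'dct': 'http://purl.org/dc/terms/'}

def references_from_xml(xml_text):
    """Return a list of {'scheme': ..., 'url': ...} dicts, one per dct:references element."""
    root = ET.fromstring(xml_text)
    return [
        {'scheme': el.get('scheme'), 'url': el.text}
        for el in root.findall('dct:references', NS)
    ]

# Trimmed sample: only the OPeNDAP reference of the record is kept.
sample = (
    '<csw:SummaryRecord xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" '
    'xmlns:dct="http://purl.org/dc/terms/">'
    '<dct:references scheme="urn:x-esri:specification:ServiceType:OPeNDAP">'
    'http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day'
    '</dct:references>'
    '</csw:SummaryRecord>'
)

refs = references_from_xml(sample)
print(refs[0]['scheme'])
print(refs[0]['url'])
```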


In [12]:
# let's look at the references
a.references


Out[12]:
[{'scheme': 'urn:x-esri:specification:ServiceType:ArcIMS:Metadata:Document',
  'url': 'http://www.nodc.noaa.gov/geoportal/csw?getxml=%7B152D8C15-039C-4520-A7E6-CC39E5D6E8DB%7D'},
 {'scheme': 'urn:x-esri:specification:ServiceType:OPeNDAP',
  'url': 'http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day'},
 {'scheme': 'urn:x-esri:specification:ServiceType:WCS',
  'url': 'http://oceanwatch.pfeg.noaa.gov/thredds/wcs/satellite/G1/ssta/1day?service=WCS&version=1.0.0&request=GetCapabilities'},
 {'scheme': 'urn:x-esri:specification:ServiceType:WCT',
  'url': 'http://www.ncdc.noaa.gov/oa/wct/wct-jnlp-beta.php?singlefile=http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day'}]

In [13]:
# get the URL for a specific ServiceType from each record
def service_urls(records, service_string='urn:x-esri:specification:ServiceType:OPeNDAP'):
    urls = []
    for key, rec in records.items():
        # build a generator over the record's references and take the first match;
        # if no reference has the requested scheme, next() returns the default (None)
        url = next((d['url'] for d in rec.references if d['scheme'] == service_string), None)
        if url is not None:
            urls.append(url)
    return urls

In [14]:
dap_urls = service_urls(csw.records, service_string='urn:x-esri:specification:ServiceType:OPeNDAP')
print(dap_urls)


['http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/cru/crutem3/std/air.mon.anom.samplingerror.nc', 'http://data.nodc.noaa.gov/thredds/dodsC/ghrsst/cfg/aggregation/fullAgg/aggregate__ghrsst_L4_GLOB_EUR_ODYSSEA.ncml', 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/cru/crutem3/std/air.mon.anom.nc', 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/cru/crutem3/std/air.mon.anom.biased2.5.nc', 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/cru/crutem3/std/air.mon.anom.stationerror.nc', 'http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day', 'http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/HadleyCenter/HadISST', 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/cru/crutem3/std/air.mon.anom.biased97.5.nc', 'http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/GR/ssta/1day', 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/cru/crutem3/std/air.mon.anom.nobs.nc']

In [15]:
# find all the WMS ServiceType URLs
wms_urls = service_urls(csw.records, service_string='urn:x-esri:specification:ServiceType:WMS')
print(wms_urls)


['http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/var/air.mon.anom.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/hadcrut3/std/air.mon.anom.biased2.5.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/hadcrut3/std/air.mon.anom.mserror.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/std/air.mon.anom.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/std/air.mon.anom.stationerror.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/std/air.mon.anom.biased97.5.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/hadcrut3/std/air.mon.anom.biased97.5.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/std/air.mon.anom.biased2.5.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/std/air.mon.anom.samplingerror.nc?service=WMS&version=1.3.0&request=GetCapabilities', 'http://www.esrl.noaa.gov/psd/thredds/wms/Datasets/cru/crutem4/std/air.mon.anom.nobs.nc?service=WMS&version=1.3.0&request=GetCapabilities']
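
Both URL lists above hold 10 entries for 20 records, and a flat list no longer says which record each URL came from. One possible variant, sketched here under stated assumptions, keeps the record identifier and groups every service URL under it. `FakeRecord`, `services_by_record`, and the `example.record` entry are hypothetical stand-ins so the snippet runs without a catalogue connection; with a live connection, `csw.records` would be passed instead.

```python
from collections import namedtuple

# Hypothetical stand-in for an OWSLib record; only `references` is modeled.
FakeRecord = namedtuple('FakeRecord', ['references'])

PREFIX = 'urn:x-esri:specification:ServiceType:'

def services_by_record(records):
    """Map each record id to a {service name: url} dict built from its references."""
    by_record = {}
    for key, rec in records.items():
        # If a scheme appears twice in one record, the later URL overwrites the earlier one.
        by_record[key] = {
            ref['scheme'][len(PREFIX):]: ref['url']
            for ref in rec.references
            if ref['scheme'].startswith(PREFIX)
        }
    return by_record

records = {
    # References copied from the sample record shown earlier.
    'satellite.G1.ssta.1day': FakeRecord(references=[
        {'scheme': PREFIX + 'OPeNDAP',
         'url': 'http://oceanwatch.pfeg.noaa.gov/thredds/dodsC/satellite/G1/ssta/1day'},
        {'scheme': PREFIX + 'WCS',
         'url': 'http://oceanwatch.pfeg.noaa.gov/thredds/wcs/satellite/G1/ssta/1day'
                '?service=WCS&version=1.0.0&request=GetCapabilities'},
    ]),
    # Made-up record, included only to show a WMS-only entry.
    'example.record': FakeRecord(references=[
        {'scheme': PREFIX + 'WMS', 'url': 'http://example.org/wms?service=WMS'},
    ]),
}

services = services_by_record(records)
with_dap = sorted(key for key, svc in services.items() if 'OPeNDAP' in svc)
print(with_dap)
```

Keeping the identifier makes it possible to tell which records offer OPeNDAP, which offer WMS, and which offer both.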

In [18]:
a.uris


Out[18]:
[]

In [19]:
type a.uris


  File "<ipython-input-19-0d160135903d>", line 1
    type a.uris
         ^
SyntaxError: invalid syntax

In [20]:
type(a.uris)


Out[20]:
list
