Quick introduction

erddapy can be installed with conda

conda install --channel conda-forge erddapy

or pip

pip install erddapy
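
You can verify the installation by checking the package version (a quick sketch, assuming your erddapy release exposes __version__, as recent ones do):

import erddapy

print(erddapy.__version__)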

First we need to instantiate the ERDDAP object, pointing it to the server we want to query.


In [1]:
from erddapy import ERDDAP


e = ERDDAP(
    server="https://gliders.ioos.us/erddap",
    protocol="tabledap",
    response="csv",
)

Now we can populate the object with constraints, the variables of interest, and the dataset id.


In [2]:
e.dataset_id = "whoi_406-20160902T1700"

e.constraints = {
    "time>=": "2016-07-10T00:00:00Z",
    "time<=": "2017-02-10T00:00:00Z",
    "latitude>=": 38.0,
    "latitude<=": 41.0,
    "longitude>=": -72.0,
    "longitude<=": -69.0,
}

e.variables = [
    "depth",
    "latitude",
    "longitude",
    "salinity",
    "temperature",
    "time",
]


url = e.get_download_url()

print(url)


https://gliders.ioos.us/erddap/tabledap/whoi_406-20160902T1700.csv?depth,latitude,longitude,salinity,temperature,time&time>=1468108800.0&time<=1486684800.0&latitude>=38.0&latitude<=41.0&longitude>=-72.0&longitude<=-69.0

In [3]:
import pandas as pd


df = e.to_pandas(
    index_col="time (UTC)",
    parse_dates=True,
).dropna()

df.head()


Out[3]:
depth (m) latitude (degrees_north) longitude (degrees_east) salinity (1) temperature (Celsius)
time (UTC)
2016-09-03 20:15:46+00:00 5.35 40.990881 -71.12439 32.245422 20.6620
2016-09-03 20:15:46+00:00 6.09 40.990881 -71.12439 32.223183 20.6512
2016-09-03 20:15:46+00:00 6.72 40.990881 -71.12439 32.237950 20.6047
2016-09-03 20:15:46+00:00 7.37 40.990881 -71.12439 32.235470 20.5843
2016-09-03 20:15:46+00:00 8.43 40.990881 -71.12439 32.224503 20.5691

In [4]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

fig, ax = plt.subplots(figsize=(17, 2))
cs = ax.scatter(
    df.index,
    df["depth (m)"],
    s=15,
    c=df["temperature (Celsius)"],
    marker="o",
    edgecolor="none"
)

ax.invert_yaxis()
ax.set_xlim(df.index[0], df.index[-1])
xfmt = mdates.DateFormatter("%H:%Mh\n%d-%b")
ax.xaxis.set_major_formatter(xfmt)

cbar = fig.colorbar(cs, orientation="vertical", extend="both")
cbar.ax.set_ylabel(r"Temperature ($^\circ$C)")
ax.set_ylabel("Depth (m)");



Longer introduction

First we need to instantiate the ERDDAP URL constructor for a server. In this example we will use https://gliders.ioos.us/erddap.


In [5]:
from erddapy import ERDDAP


e = ERDDAP(server="https://gliders.ioos.us/erddap")

What are the methods/attributes available?


In [6]:
[method for method in dir(e) if not method.startswith("_")]


Out[6]:
['constraints',
 'dataset_id',
 'get_categorize_url',
 'get_download_url',
 'get_info_url',
 'get_search_url',
 'get_var_by_attr',
 'params',
 'protocol',
 'requests_kwargs',
 'response',
 'server',
 'to_iris',
 'to_pandas',
 'to_xarray',
 'variables']

All the get_<method> calls will return a valid ERDDAP URL for the requested response and options. erddapy will raise an error if the URL's HEAD response cannot be validated.
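
For example, a minimal sketch of that validation (the dataset id below is made up):

from requests.exceptions import HTTPError

try:
    # "no-such-dataset" is a made-up id; the HEAD check should fail.
    e.get_download_url(dataset_id="no-such-dataset", protocol="tabledap")
except HTTPError as err:
    print(f"Invalid request: {err}")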


In [7]:
print(e.get_search_url(search_for="all"))


https://gliders.ioos.us/erddap/search/advanced.html?page=1&itemsPerPage=1000&protocol=(ANY)&cdm_data_type=(ANY)&institution=(ANY)&ioos_category=(ANY)&keywords=(ANY)&long_name=(ANY)&standard_name=(ANY)&variableName=(ANY)&minLon=(ANY)&maxLon=(ANY)&minLat=(ANY)&maxLat=(ANY)&minTime=(ANY)&maxTime=(ANY)&searchFor=all

There are many responses available; see the docs for griddap and tabledap respectively. The most useful ones for Pythonistas are .csv and .nc, which can be read with pandas and netCDF4-python, respectively.
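
As a quick sketch of how responses work, the same search can be generated with different responses; only the extension of the endpoint changes:

for response in ("html", "csv", "json"):
    print(e.get_search_url(search_for="all", response=response).split("?")[0])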

Let's load the csv response directly with pandas.


In [8]:
import pandas as pd


df = pd.read_csv(e.get_search_url(response="csv", search_for="all"))

In [9]:
print(
    f'We have {len(set(df["tabledap"].dropna()))} '
    f'tabledap, {len(set(df["griddap"].dropna()))} '
    f'griddap, and {len(set(df["wms"].dropna()))} wms endpoints.'
)


We have 544 tabledap, 0 griddap, and 0 wms endpoints.

We can refine our search by providing some constraints.


In [10]:
def show_iframe(src):
    """Helper function to show HTML returns."""
    from IPython.display import HTML
    iframe = f'<iframe src="{src}" width="100%" height="950"></iframe>'
    return HTML(iframe)

Let's narrow the search area, time span, and look for sea_water_temperature only.


In [11]:
kw = {
    "standard_name": "sea_water_temperature",
    "min_lon": -72.0,
    "max_lon": -69.0,
    "min_lat": 38.0,
    "max_lat": 41.0,
    "min_time": "2016-07-10T00:00:00Z",
    "max_time": "2017-02-10T00:00:00Z",
    "cdm_data_type": "trajectoryprofile"
}

search_url = e.get_search_url(response="html", **kw)

show_iframe(search_url)


Out[11]:
[iframe showing the ERDDAP advanced search form]

We can see that the search form above was correctly populated with the constraints we provided.

Let us change the response from .html to .csv so we can load it into a pandas.DataFrame and inspect which Dataset IDs are available for download.


In [12]:
search_url = e.get_search_url(response="csv", **kw)
search = pd.read_csv(search_url)
gliders = search["Dataset ID"].values

gliders_list = "\n".join(gliders)
print(f"Found {len(gliders)} Glider Datasets:\n{gliders_list}")


Found 16 Glider Datasets:
blue-20160818T1448
cp_335-20170116T1459
cp_336-20161011T0027
cp_336-20170116T1254
cp_340-20160809T0230
cp_374-20160529T0035
cp_374-20161011T0106
cp_376-20160527T2050
cp_379-20170116T1246
cp_380-20161011T2046
cp_387-20160404T1858
cp_388-20160809T1409
cp_389-20161011T2040
silbo-20160413T1534
sp022-20170209T1616
whoi_406-20160902T1700

Now that we know the Dataset IDs we can explore their metadata with the get_info_url method.


In [13]:
info_url = e.get_info_url(dataset_id=gliders[0], response="html")

show_iframe(src=info_url)


Out[13]:
[iframe showing the metadata page for the first glider dataset]

Again, with the csv response, we can manipulate the metadata and extract, for example, the variables listed in the cdm_profile_variables attribute.


In [14]:
info_url = e.get_info_url(dataset_id=gliders[0], response='csv')

info = pd.read_csv(info_url)

info.head()


Out[14]:
Row Type Variable Name Attribute Name Data Type Value
0 attribute NC_GLOBAL acknowledgement String This deployment supported by NOAA U.S. IOOS
1 attribute NC_GLOBAL cdm_data_type String TrajectoryProfile
2 attribute NC_GLOBAL cdm_profile_variables String time_uv,lat_uv,lon_uv,u,v,profile_id,time,lati...
3 attribute NC_GLOBAL cdm_trajectory_variables String trajectory,wmo_id
4 attribute NC_GLOBAL comment String Glider deployed by the University of Massachus...

In [15]:
"".join(info.loc[info["Attribute Name"] == "cdm_profile_variables", "Value"])


Out[15]:
'time_uv,lat_uv,lon_uv,u,v,profile_id,time,latitude,longitude'

Selecting variables by their attributes is such a common operation that erddapy brings its own method to simplify this task.

The get_var_by_attr method is inspired by netCDF4-python's get_variables_by_attributes; however, because erddapy operates on remote servers, it returns the variable names instead of the actual variables.

Here we check which variable(s) are associated with the standard_name we used in the search.

Note that get_var_by_attr caches the last response, so repeated queries against the same dataset are fast, but it loses that state when a different dataset is requested.

(See the execution times below.)


In [16]:
%%time

# First one, slow.
e.get_var_by_attr(
    dataset_id="whoi_406-20160902T1700",
    standard_name="sea_water_temperature"
)


CPU times: user 193 ms, sys: 2.81 ms, total: 195 ms
Wall time: 1.73 s
Out[16]:
['temperature']

In [17]:
%%time

# Second one on the same glider: served from the cache, much faster.
e.get_var_by_attr(
    dataset_id="whoi_406-20160902T1700",
    standard_name="sea_water_practical_salinity"
)


CPU times: user 55 µs, sys: 15 µs, total: 70 µs
Wall time: 73.9 µs
Out[17]:
['salinity']

In [18]:
%%time

# New one, slow again.
e.get_var_by_attr(
    dataset_id="cp_336-20170116T1254",
    standard_name="sea_water_practical_salinity"
)


CPU times: user 123 ms, sys: 1.71 ms, total: 124 ms
Wall time: 1.93 s
Out[18]:
['salinity']

Another way to browse datasets is via the categorize URL. In the example below we can get all the standard_names available on the server with a single request.


In [19]:
url = e.get_categorize_url(
    categorize_by="standard_name",
    response="csv"
)

pd.read_csv(url)["Category"]


Out[19]:
0                                                 _null
1     concentration_of_colored_dissolved_organic_mat...
2                              conductivity_status_flag
3                                   density_status_flag
4                                                 depth
5                                     depth_status_flag
6     downwelling_photosynthetic_photon_spherical_ir...
7                           eastward_sea_water_velocity
8               eastward_sea_water_velocity_status_flag
9          fractional_saturation_of_oxygen_in_sea_water
10    fractional_saturation_of_oxygen_in_sea_water_s...
11                                             latitude
12                                 latitude_status_flag
13                                            longitude
14                                longitude_status_flag
15     mass_concentration_of_chlorophyll_a_in_sea_water
16       mass_concentration_of_chlorophyll_in_sea_water
17    mass_concentration_of_chlorophyll_in_sea_water...
18    mole_concentration_of_dissolved_molecular_oxyg...
19    mole_concentration_of_dissolved_molecular_oxyg...
20           moles_of_oxygen_per_unit_mass_in_sea_water
21    moles_of_oxygen_per_unit_mass_in_sea_water_sta...
22                             north_sea_water_velocity
23                         northward_sea_water_velocity
24             northward_sea_water_velocity_status_flag
25                                             pressure
26                                 pressure_status_flag
27                                 radiation_wavelength
28                                 salinity_status_flag
29                                    sea_water_density
30                        sea_water_density_status_flag
31                    sea_water_electrical_conductivity
32        sea_water_electrical_conductivity_status_flag
33        sea_water_electrival_conductivity_status_flag
34                          sea_water_potential_density
35                      sea_water_potential_temperature
36                         sea_water_practical_salinity
37             sea_water_practical_salinity_status_flag
38                                   sea_water_pressure
39                       sea_water_pressure_status_flag
40                                   sea_water_salinity
41                       sea_water_salinity_status_flag
42                                    sea_water_sigma_t
43                                sea_water_temperature
44                    sea_water_temperature_status_flag
45                      sea_water_turbidity_status_flag
46                          speed_of_sound_in_sea_water
47                              temperature_status_flag
48                             temperuature_status_flag
49                                                 time
50                                     time_status_flag
51    volume_absorption_coefficient_of_radiative_flu...
52    volume_absorption_coefficient_of_radiative_flu...
53    volume_backwards_scattering_coefficient_of_rad...
54    volume_scattering_coefficient_of_radiative_flu...
Name: Category, dtype: object

We can also pass a value to filter the categorize results.


In [20]:
url = e.get_categorize_url(
    categorize_by="institution",
    value="woods_hole_oceanographic_institution",
    response="csv"
)

df = pd.read_csv(url)

In [21]:
whoi_gliders = df.loc[~df["tabledap"].isnull(), "Dataset ID"].tolist()

whoi_gliders


Out[21]:
['sp007-20170427T1652',
 'sp010-20150409T1524',
 'sp010-20170707T1647',
 'sp010-20180620T1455',
 'sp022-20170209T1616',
 'sp022-20170802T1414',
 'sp022-20180124T1514',
 'sp022-20180422T1229',
 'sp022-20180912T1553',
 'sp055-20150716T1359',
 'sp062-20171116T1557',
 'sp062-20190201T1350',
 'sp065-20151001T1507',
 'sp065-20180310T1828',
 'sp065-20181015T1349',
 'sp065-20190517T1530',
 'sp066-20151217T1624',
 'sp066-20160818T1505',
 'sp066-20170416T1744',
 'sp066-20171129T1616',
 'sp066-20180629T1411',
 'sp066-20190301T1640',
 'sp066-20190724T1532',
 'sp069-20170907T1531',
 'sp069-20180411T1516',
 'sp069-20181109T1607',
 'whoi_406-20160902T1700']

Now it is easy to filter out non-WHOI gliders from our original glider search.


In [22]:
gliders = [glider for glider in gliders if glider in whoi_gliders]
gliders


Out[22]:
['sp022-20170209T1616', 'whoi_406-20160902T1700']

With Python it is easy to loop over all the dataset_ids and fetch, for each one, the variables that have a standard_name attribute.


In [23]:
variables = [
    e.get_var_by_attr(
        dataset_id=glider,
        standard_name=lambda v: v is not None
    )
    for glider in gliders
]

We can construct a set with the common variables in those dataset_ids.


In [24]:
common_variables = set(variables[0]).intersection(*variables[1:])

common_variables


Out[24]:
{'conductivity',
 'conductivity_qc',
 'density',
 'density_qc',
 'depth',
 'depth_qc',
 'lat_uv',
 'lat_uv_qc',
 'latitude',
 'latitude_qc',
 'lon_uv',
 'lon_uv_qc',
 'longitude',
 'longitude_qc',
 'precise_lat',
 'precise_lon',
 'precise_time',
 'precise_time_qc',
 'pressure',
 'pressure_qc',
 'salinity',
 'salinity_qc',
 'temperature',
 'temperature_qc',
 'time',
 'time_qc',
 'time_uv',
 'time_uv_qc',
 'u',
 'u_qc',
 'v',
 'v_qc'}

Last, but not least, the download endpoint!

It is important to note that the download constraints are based on the dataset's variable names, not on the standardized names used with the get_search_url method.


In [25]:
constraints = {
    "longitude>=": kw["min_lon"],
    "longitude<=": kw["max_lon"],
    "latitude>=": kw["min_lat"],
    "latitude<=": kw["max_lat"],
    "time>=": kw["min_time"],
    "time<=": kw["max_time"],
}



download_url = e.get_download_url(
    dataset_id=gliders[0],
    protocol="tabledap",
    variables=common_variables,
    constraints=constraints
)

print(download_url)


https://gliders.ioos.us/erddap/tabledap/sp022-20170209T1616.html?depth_qc,precise_lat,density_qc,time_uv,precise_time,u,longitude_qc,precise_lon,salinity_qc,v_qc,lon_uv,salinity,time_uv_qc,u_qc,temperature,latitude,longitude,v,density,temperature_qc,conductivity,lat_uv,precise_time_qc,pressure_qc,depth,pressure,time_qc,conductivity_qc,lat_uv_qc,latitude_qc,time,lon_uv_qc&longitude>=-72.0&longitude<=-69.0&latitude>=38.0&latitude<=41.0&time>=1468108800.0&time<=1486684800.0

Putting everything in DataFrame objects.


In [26]:
from requests.exceptions import HTTPError


def download_csv(url):
    return pd.read_csv(
        url,
        index_col="time",
        parse_dates=True,
        skiprows=[1],
    )


dfs = {}
for glider in gliders:
    try:
        download_url = e.get_download_url(
            dataset_id=glider,
            protocol="tabledap",
            variables=common_variables,
            response="csv",
            constraints=constraints
        )
    except HTTPError:
        print(f"Failed to download {glider}.")
        continue
    dfs.update({glider: download_csv(download_url)})


Failed to download sp022-20170209T1616.

The glider datasets should come masked according to their QC flags, but we found that is not always the case. The sketch below shows one way to apply the masks described by the data's QC variables.
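
A minimal sketch, assuming the common IOOS/QARTOD convention where a QC flag value of 1 means the measurement passed all checks:

def apply_qc(df):
    # Keep only rows whose temperature and salinity QC flags are "good" (1).
    # The flag convention is an assumption; check each dataset's metadata.
    good = (df["temperature_qc"] == 1) & (df["salinity_qc"] == 1)
    return df[good]


dfs = {glider: apply_qc(df) for glider, df in dfs.items()}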

Finally, let's see some figures!


In [27]:
%matplotlib inline
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter


def make_map(extent):
    fig, ax = plt.subplots(
        figsize=(9, 9),
        subplot_kw=dict(projection=ccrs.PlateCarree())
    )
    ax.coastlines(resolution="10m")
    ax.set_extent(extent)

    ax.set_xticks([extent[0], extent[1]], crs=ccrs.PlateCarree())
    ax.set_yticks([extent[2], extent[3]], crs=ccrs.PlateCarree())
    lon_formatter = LongitudeFormatter(zero_direction_label=True)
    lat_formatter = LatitudeFormatter()
    ax.xaxis.set_major_formatter(lon_formatter)
    ax.yaxis.set_major_formatter(lat_formatter)

    return fig, ax


dx = dy = 0.5
extent = kw["min_lon"]-dx, kw["max_lon"]+dx, kw["min_lat"]-dy, kw["max_lat"]+dy

fig, ax = make_map(extent)
for glider, df in dfs.items():
    ax.plot(df["longitude"], df["latitude"], label=glider)

leg = ax.legend()



In [28]:
def glider_scatter(df, ax, glider):
    ax.scatter(df["temperature"], df["salinity"],
               s=10, alpha=0.5, label=glider)

fig, ax = plt.subplots(figsize=(9, 9))
ax.set_ylabel("salinity")
ax.set_xlabel("temperature")
ax.grid(True)

for glider, df in dfs.items():
    glider_scatter(df, ax, glider)

leg = ax.legend()


Extras

OPeNDAP response


In [29]:
e.constraints = None
e.protocol = "tabledap"

opendap_url = e.get_download_url(
    dataset_id="whoi_406-20160902T1700",
    response="opendap",
)

print(opendap_url)


https://gliders.ioos.us/erddap/tabledap/whoi_406-20160902T1700

In [30]:
from netCDF4 import Dataset


with Dataset(opendap_url) as nc:
    print(nc.summary)


Slocum glider dataset gathered as part of the TEMPESTS (The Experiment to Measure and Predict East coast STorm Strength), funded by NOAA through CINAR (Cooperative Institute for the North Atlantic Region).

netCDF "file-like" to xarray

open_dataset will download a temporary file, so be careful with the constraints to avoid downloading several gigabytes!


In [31]:
e.dataset_id = "cp_336-20170116T1254"
e.response = "nc"
e.variables = common_variables
e.constraints = constraints

download_url = e.get_download_url()

In [32]:
import requests


def humansize(nbytes):
    suffixes = ["B", "KB", "MB", "GB", "TB", "PB"]
    k = 0
    while nbytes >= 1024 and k < len(suffixes)-1:
        nbytes /= 1024.
        k += 1
    f = f"{nbytes:.2f}".rstrip("0").rstrip(".")
    return f"{f} {suffixes[k]}"

r = requests.head(download_url)
nbytes = float(r.headers["Content-Length"])
humansize(nbytes)


Out[32]:
'600.05 KB'

That is the uncompressed size; the actual download will be smaller than that because ERDDAP streams gzip'ed data.


In [33]:
r.headers["Content-Encoding"]


Out[33]:
'gzip'

In [34]:
ds = e.to_xarray(decode_times=False)

ds


Out[34]:
<xarray.Dataset>
Dimensions:          (row: 16232)
Coordinates:
    time_uv          (row) float64 ...
    lon_uv           (row) float64 ...
    lat_uv           (row) float64 ...
Dimensions without coordinates: row
Data variables:
    depth_qc         (row) float32 ...
    precise_lat      (row) float64 ...
    density_qc       (row) float32 ...
    precise_time     (row) float64 ...
    u                (row) float64 ...
    longitude_qc     (row) float32 ...
    precise_lon      (row) float64 ...
    salinity_qc      (row) float32 ...
    v_qc             (row) float32 ...
    salinity         (row) float32 ...
    time_uv_qc       (row) float32 ...
    u_qc             (row) float32 ...
    temperature      (row) float32 ...
    latitude         (row) float64 ...
    longitude        (row) float64 ...
    v                (row) float64 ...
    density          (row) float32 ...
    temperature_qc   (row) float32 ...
    conductivity     (row) float32 ...
    precise_time_qc  (row) float32 ...
    pressure_qc      (row) float32 ...
    depth            (row) float32 ...
    pressure         (row) float32 ...
    time_qc          (row) float32 ...
    conductivity_qc  (row) float32 ...
    lat_uv_qc        (row) float32 ...
    latitude_qc      (row) float32 ...
    time             (row) float64 ...
    lon_uv_qc        (row) float32 ...
Attributes:
    acknowledgement:               Funding provided by the National Science F...
    cdm_data_type:                 TrajectoryProfile
    cdm_profile_variables:         time_uv,lat_uv,lon_uv,u,v,profile_id,time,...
    cdm_trajectory_variables:      trajectory,wmo_id
    contributor_name:              Paul Matthias,Peter Brickley,Sheri White,D...
    contributor_role:              CGSN Program Manager,CGSN Operations Engin...
    Conventions:                   Unidata Dataset Discovery v1.0, COARDS, CF...
    creator_email:                 kerfoot@marine.rutgers.edu
    creator_name:                  John Kerfoot
    creator_url:                   http://rucool.marine.rutgers.edu
    date_created:                  2017-04-19T14:33:41Z
    date_issued:                   2017-04-19T14:33:41Z
    date_modified:                 2017-04-19T14:33:41Z
    deployment_number:             4
    Easternmost_Easting:           -69.98303682074565
    featureType:                   TrajectoryProfile
    format_version:                https://github.com/ioos/ioosngdac/tree/mas...
    geospatial_lat_max:            39.91726417227544
    geospatial_lat_min:            39.32370673037986
    geospatial_lat_units:          degrees_north
    geospatial_lon_max:            -69.98303682074565
    geospatial_lon_min:            -71.18259602604894
    geospatial_lon_units:          degrees_east
    geospatial_vertical_max:       976.756
    geospatial_vertical_min:       -0.03969577
    geospatial_vertical_positive:  down
    geospatial_vertical_units:     m
    history:                       2017-04-19T14:33:35Z: Data Source /Users/k...
    id:                            cp_336-20170116T125400Z
    infoUrl:                       http://data.ioos.us/gliders/erddap/
    institution:                   Ocean Observatories Initiative
    ioos_dac_checksum:             f42b729c0bf19af1b7229b21350ebaaf
    ioos_dac_completed:            False
    keywords:                      AUVS > Autonomous Underwater Vehicles, Oce...
    keywords_vocabulary:           GCMD Science Keywords
    license:                       All OOI data including data from OOI core ...
    Metadata_Conventions:          Unidata Dataset Discovery v1.0, COARDS, CF...
    metadata_link:                 http://ooi.visualocean.net/sites/view/CP05...
    naming_authority:              org.oceanobservatories
    Northernmost_Northing:         39.91726417227544
    platform_type:                 Slocum Glider
    processing_level:              Contains any/all of the following: L0 Data...
    project:                       Ocean Observatories Initiative
    publisher_email:               kerfoot@marine.rutgers.edu
    publisher_name:                John Kerfoot
    publisher_url:                 http://rucool.marine.rutgers.edu
    references:                    http://oceanobservatories.org/
    sea_name:                      Mid-Atlantic Bight
    source:                        Observational data from a profiling glider
    sourceUrl:                     (local files)
    Southernmost_Northing:         39.32370673037986
    standard_name_vocabulary:      CF Standard Name Table v27
    subsetVariables:               trajectory,wmo_id,time_uv,lat_uv,lon_uv,u,...
    summary:                       The Pioneer Array is located off the coast...
    time_coverage_end:             2017-02-09T23:03:25Z
    time_coverage_start:           2017-01-16T13:03:04Z
    title:                         cp_336-20170116T1254
    Westernmost_Easting:           -71.18259602604894

In [35]:
ds["temperature"]


Out[35]:
<xarray.DataArray 'temperature' (row: 16232)>
array([14.3976, 14.4236, 14.4596, ...,  4.4004,  4.3975,  4.3978],
      dtype=float32)
Coordinates:
    time_uv  (row) float64 ...
    lon_uv   (row) float64 ...
    lat_uv   (row) float64 ...
Dimensions without coordinates: row
Attributes:
    _ChunkSizes:          1
    actual_range:         [ 0.     17.2652]
    ancillary_variables:  temperature_qc
    colorBarMaximum:      32.0
    colorBarMinimum:      0.0
    coordinates:          time lat lon depth
    instrument:           instrument_ctd
    ioos_category:        Temperature
    long_name:            Sea Water Temperature
    observation_type:     measured
    platform:             platform
    source_variable:      sci_water_temp
    standard_name:        sea_water_temperature
    units:                degree_Celsius
    valid_max:            40.0
    valid_min:            -5.0

In [36]:
import numpy as np


data = ds["temperature"].values
depth = ds["depth"].values

mask = ~np.ma.masked_invalid(data).mask

In [37]:
data = data[mask]
depth = depth[mask]
lon = ds["longitude"].values[mask]
lat = ds["latitude"].values[mask]

In [38]:
import warnings


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    mask = depth <= 5

data = data[mask]
depth = depth[mask]
lon = lon[mask]
lat = lat[mask]

In [39]:
%matplotlib inline
import matplotlib.pyplot as plt
import cartopy.crs as ccrs


dx = dy = 1.5
extent = (
    ds.geospatial_lon_min-dx, ds.geospatial_lon_max+dx,
    ds.geospatial_lat_min-dy, ds.geospatial_lat_max+dy
)
fig, ax = make_map(extent)

cs = ax.scatter(lon, lat, c=data, s=50, alpha=0.5, edgecolor="none")
cbar = fig.colorbar(cs, orientation="vertical",
                    fraction=0.1, shrink=0.9, extend="both")
ax.coastlines("10m");


Or use iris, if the data are easier to navigate via the CF conventions data model.


In [40]:
import warnings

# Iris warnings are quite verbose!
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    cubes = e.to_iris()

print(cubes)


0: longitude / (degrees)               (-- : 16232)
1: northward_sea_water_velocity / (m s-1) (-- : 16232)
2: latitude / (degrees)                (-- : 16232)
3: sea_water_practical_salinity / (unknown) (-- : 16232)
4: sea_water_density / (kg/m^3)        (-- : 16232)
5: time / (seconds since 1970-01-01T00:00:00Z) (-- : 16232)
6: eastward_sea_water_velocity / (m s-1) (-- : 16232)
7: sea_water_electrical_conductivity / (S m-1) (-- : 16232)
8: longitude Variable Quality Flag / (1) (-- : 16232)
9: precise_time Variable Quality Flag / (1) (-- : 16232)
10: latitude Variable Quality Flag / (1) (-- : 16232)
11: longitude / (degrees)               (-- : 16232)
12: sea_water_temperature / (degree_Celsius) (-- : 16232)
13: latitude / (degrees)                (-- : 16232)
14: sea_water_pressure / (dbar)         (-- : 16232)

In [41]:
cubes.extract_strict("sea_water_pressure")


Out[41]:
Sea Water Pressure (dbar) --
Shape 16232
Attributes
Conventions Unidata Dataset Discovery v1.0, COARDS, CF-1.6
Easternmost_Easting -69.98303682074565
Metadata_Conventions Unidata Dataset Discovery v1.0, COARDS, CF-1.6
Northernmost_Northing 39.91726417227544
Southernmost_Northing 39.32370673037986
Westernmost_Easting -71.18259602604894
_ChunkSizes 1
acknowledgement Funding provided by the National Science Foundation. Glider deployed by...
actual_range [-4.0000e-02 9.8658e+02]
cdm_data_type TrajectoryProfile
cdm_profile_variables time_uv,lat_uv,lon_uv,u,v,profile_id,time,latitude,longitude
cdm_trajectory_variables trajectory,wmo_id
colorBarMaximum 2000.0
colorBarMinimum 0.0
contributor_name Paul Matthias,Peter Brickley,Sheri White,Diana Wickman,John Kerfoot
contributor_role CGSN Program Manager,CGSN Operations Engineer,CGSN Operations Engineer,CGSN...
creator_email kerfoot@marine.rutgers.edu
creator_name John Kerfoot
creator_url http://rucool.marine.rutgers.edu
date_created 2017-04-19T14:33:41Z
date_issued 2017-04-19T14:33:41Z
date_modified 2017-04-19T14:33:41Z
deployment_number 4
featureType TrajectoryProfile
format_version https://github.com/ioos/ioosngdac/tree/master/nc/template/IOOS_Glider_...
geospatial_lat_max 39.91726417227544
geospatial_lat_min 39.32370673037986
geospatial_lat_units degrees_north
geospatial_lon_max -69.98303682074565
geospatial_lon_min -71.18259602604894
geospatial_lon_units degrees_east
geospatial_vertical_max 976.756
geospatial_vertical_min -0.03969577
geospatial_vertical_positive down
geospatial_vertical_units m
history 2017-04-19T14:33:35Z: Data Source /Users/kerfoot/datasets/ooi/dac/deployments/CP05MOAS-GL336-deployment0004-telemetered/nc-source/deployment0004_CP05MOAS-GL336-03-CTDGVM000-telemetered-ctdgv_m_glider_instrument_20170418T091545.141720-20170419T010002.022580.nc
2017-04-19T14:33:41Z:...
id cp_336-20170116T125400Z
infoUrl http://data.ioos.us/gliders/erddap/
institution Ocean Observatories Initiative
instrument instrument_ctd
ioos_category Pressure
ioos_dac_checksum f42b729c0bf19af1b7229b21350ebaaf
ioos_dac_completed False
keywords AUVS > Autonomous Underwater Vehicles, Oceans > Ocean Pressure > Water...
keywords_vocabulary GCMD Science Keywords
license All OOI data including data from OOI core sensors and all proposed sensors...
metadata_link http://ooi.visualocean.net/sites/view/CP05MOAS
naming_authority org.oceanobservatories
observation_type calculated
platform platform
platform_type Slocum Glider
positive down
processing_level Contains any/all of the following: L0 Data (Unprocessed, parsed data product...
project Ocean Observatories Initiative
publisher_email kerfoot@marine.rutgers.edu
publisher_name John Kerfoot
publisher_url http://rucool.marine.rutgers.edu
reference_datum sea-surface
references http://oceanobservatories.org/
sea_name Mid-Atlantic Bight
source Observational data from a profiling glider
sourceUrl (local files)
source_variable sci_water_pressure_dbar
standard_name_vocabulary CF Standard Name Table v27
subsetVariables trajectory,wmo_id,time_uv,lat_uv,lon_uv,u,v,profile_id,time,latitude,l...
summary The Pioneer Array is located off the coast of New England, south of Martha's...
time_coverage_end 2017-02-09T23:03:25Z
time_coverage_start 2017-01-16T13:03:04Z
title cp_336-20170116T1254
valid_max 2000.0
valid_min 0.0

This example is written in a Jupyter Notebook. Click here to download the notebook so you can run it locally, or click here to run a live instance of this notebook.