If this Notebook is running on a development system (where pip install oxyfloat has not been executed), oxyfloat's parent directory needs to be added to the Python search path.
In [1]:
import sys
sys.path.insert(0, '../')
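If the same Notebook is shared between development and installed environments, the path change can be made conditional. The following is a minimal sketch using only the standard library; the helper name ensure_importable is hypothetical and not part of oxyfloat.

```python
import importlib.util
import sys


def ensure_importable(package, fallback_dir='../'):
    """Add fallback_dir to sys.path only when package cannot be found.

    Returns True if the fallback directory was needed, False otherwise.
    (Hypothetical helper for illustration; not part of oxyfloat.)
    """
    if importlib.util.find_spec(package) is not None:
        return False  # already installed or already on the search path
    if fallback_dir not in sys.path:
        sys.path.insert(0, fallback_dir)
    return True


# On a development system this falls back to the parent directory
used_fallback = ensure_importable('oxyfloat', '../')
```

With this in place, an installed copy of the package is left alone and sys.path is modified only when the import would otherwise fail.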
Import the ArgoData class and instantiate an ArgoData object (ad) with verbosity set to 2 so that we get INFO messages.
In [2]:
from oxyfloat import ArgoData
ad = ArgoData(verbosity=2)
You can now explore what methods the ad object has by typing "ad." in a cell and pressing the tab key. One of the methods is get_oxy_floats_from_status(); to see what it does, select it and press shift-tab with the cursor in the parentheses of "ad.get_oxy_floats_from_status()". Let's get a list of all the floats that have been out for at least 340 days and print the length of that list.
In [3]:
%%time
floats340 = ad.get_oxy_floats_from_status(age_gte=340)
print('{} floats at least 340 days old'.format(len(floats340)))
If this is the first time you've executed the cell, it will take a minute or so to read the Argo status information from the Internet (the PerformanceWarning can be ignored; for this small table it doesn't matter much).
Once the status information is read, it is cached locally and further calls to get_oxy_floats_from_status() will execute much faster. To demonstrate, let's count all the oxygen-labeled floats that have been out for at least 2 years.
In [4]:
%%time
floats730 = ad.get_oxy_floats_from_status(age_gte=730)
print('{} floats at least 730 days old'.format(len(floats730)))
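The speedup comes from reading once and reusing the local copy. Here is a minimal sketch of that general read-through cache pattern using a JSON file; it is an illustration only and makes no claim about how oxyfloat stores its cache. The function cached_fetch, the cache file name, and the second WMO number are made up for the example.

```python
import json
import os
import tempfile


def cached_fetch(cache_path, key, fetch):
    """Return the value for key, calling fetch() only if it is not cached on disk."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if key not in cache:
        cache[key] = fetch()  # slow path, e.g. a read from the Internet
        with open(cache_path, 'w') as f:
            json.dump(cache, f)
    return cache[key]


calls = []


def slow_status_read():
    """Stand-in for the slow download; records each time it is invoked."""
    calls.append(1)
    return ['1900650', '1900651']  # illustrative values only


cache_file = os.path.join(tempfile.mkdtemp(), 'status_cache.json')
first = cached_fetch(cache_file, 'age_gte=340', slow_status_read)
second = cached_fetch(cache_file, 'age_gte=340', slow_status_read)
print('slow reads performed: {}'.format(len(calls)))
```

The second call is answered from the file, so the slow function runs only once.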
Now let's find the Data Assembly Center URL for each of the floats in our list. (The returned dictionary of URLs is also locally cached.)
In [5]:
%%time
dac_urls = ad.get_dac_urls(floats340)
print(len(dac_urls))
Now, whenever we need to get profile data, our lookups for status and Data Assembly Centers will be serviced from the local cache. Let's get a Pandas DataFrame (df) of 20 profiles from the float with WMO number 1900650.
In [6]:
%%time
wmo_list = ['1900650']
ad.set_verbosity(0)
df = ad.get_float_dataframe(wmo_list, max_profiles=20)
Profile data is also cached locally. To demonstrate, perform the same command as in the previous cell and note the time difference.
In [7]:
%%time
df = ad.get_float_dataframe(wmo_list, max_profiles=20)
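The %%time magic only works inside Jupyter. To make the same first-call versus second-call comparison in a plain Python script, a small context manager is enough. This is a sketch using the standard library; the name timed is hypothetical, and the summed range is just a placeholder for the call being measured.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent inside the with-block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed('first call', timings):
    total = sum(range(100000))  # placeholder for the call being timed
with timed('second call', timings):
    total = sum(range(100000))

for label, seconds in timings.items():
    print('{}: {:.6f} s'.format(label, seconds))
```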
Examine the first 5 records of the float data.
In [8]:
df.head()
Out[8]:
There's a lot that can be done with the profile data in this DataFrame structure. We can construct a time_range string and query for all the data values at pressures less than 10 decibars:
In [9]:
time_range = '{} to {}'.format(df.index.get_level_values('time').min(),
                               df.index.get_level_values('time').max())
df.query('pressure < 10')
Out[9]:
In one command we can take the mean of all the values from the upper 10 decibars:
In [10]:
df.query('pressure < 10').groupby(level=['wmo', 'time']).mean()
Out[10]:
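To see what the query and groupby steps do without downloading anything, here is a small sketch on a synthetic DataFrame. The values are made up, and the index has only three of the levels (wmo, time, pressure) used by the real float data; it assumes pandas is installed.

```python
import pandas as pd

# Synthetic stand-in for the float DataFrame; all numbers are illustrative
index = pd.MultiIndex.from_tuples(
    [
        ('1900650', '2010-01-01', 5.0),
        ('1900650', '2010-01-01', 9.0),
        ('1900650', '2010-01-01', 50.0),
        ('1900650', '2010-01-11', 4.0),
        ('1900650', '2010-01-11', 100.0),
    ],
    names=['wmo', 'time', 'pressure'],
)
toy = pd.DataFrame({'TEMP_ADJUSTED': [20.0, 18.0, 12.0, 21.0, 8.0]}, index=index)

# Named index levels can be referenced in query() just like columns
shallow = toy.query('pressure < 10')

# One mean per (wmo, time) pair, i.e. one row per profile
surface_mean = shallow.groupby(level=['wmo', 'time']).mean()
print(surface_mean)
```

Only the three rows shallower than 10 decibars survive the query, and the groupby collapses them to one mean value per profile.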
We can plot the profiles:
In [11]:
%matplotlib inline
import matplotlib.pyplot as plt
# Parameter long_name and units copied from attributes in NetCDF files
parms = {'TEMP_ADJUSTED': 'SEA TEMPERATURE IN SITU ITS-90 SCALE (degree_Celsius)',
         'PSAL_ADJUSTED': 'PRACTICAL SALINITY (psu)',
         'DOXY_ADJUSTED': 'DISSOLVED OXYGEN (micromole/kg)'}
plt.rcParams['figure.figsize'] = (18.0, 8.0)
fig, ax = plt.subplots(1, len(parms), sharey=True)
ax[0].invert_yaxis()
ax[0].set_ylabel('SEA PRESSURE (decibar)')
# items() works in both Python 2 and 3 (iteritems() is Python 2 only)
for i, (p, label) in enumerate(parms.items()):
    ax[i].set_xlabel(label)
    ax[i].plot(df[p], df.index.get_level_values('pressure'), '.')
plt.suptitle('Float(s) ' + ' '.join(wmo_list) + ' from ' + time_range)
Out[11]:
We can plot the location of these profiles on a map:
In [12]:
from mpl_toolkits.basemap import Basemap
m = Basemap(llcrnrlon=15, llcrnrlat=-90, urcrnrlon=390, urcrnrlat=90, projection='cyl')
m.fillcontinents(color='0.8')
m.scatter(df.index.get_level_values('lon'), df.index.get_level_values('lat'), latlon=True)
plt.title('Float(s) ' + ' '.join(wmo_list) + ' from ' + time_range)
Out[12]:
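The map above starts at 15 degrees east rather than at -180 or 0, so longitudes have to be shifted by whole turns of 360 degrees to land inside the map window. With latlon=True, Basemap does this shifting for cylindrical projections. The arithmetic can be sketched as follows; this is an illustration of the idea, not Basemap's internal code, and wrap_longitude is a hypothetical helper.

```python
def wrap_longitude(lon, west_edge=15.0):
    """Shift lon by multiples of 360 degrees into [west_edge, west_edge + 360)."""
    return (lon - west_edge) % 360.0 + west_edge


examples = [-170.0, 10.0, 15.0, 200.0]
wrapped = [wrap_longitude(lon) for lon in examples]
for lon, shifted in zip(examples, wrapped):
    print('{:7.1f} -> {:7.1f}'.format(lon, shifted))
```

A float at 170 degrees west (-170) is drawn at 190, and one at 10 degrees east is drawn at 370, near the right-hand edge of the map.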