Explore locally cached Argo oxygen float data - second in a series of Notebooks

Use the oxyfloat module to get the data and Pandas to operate on it, as a test of how easily calibrations can be performed.

(See build_oxyfloat_cache.ipynb for the work that leads to this Notebook.)

Add parent directory to the path and get an ArgoData object that uses the default local cache.


In [1]:
import sys
sys.path.insert(0, '../')

from oxyfloat import ArgoData
ad = ArgoData()

Get the default list of floats that have oxygen data.


In [2]:
wmo_list = ad.get_oxy_floats_from_status()

We can explore the distribution of AGEs of the Argo floats by getting the status data in a DataFrame (sdf).


In [3]:
sdf = ad._get_df(ad._STATUS)

Define a function (dist_plot) and plot the distribution of the AGE column.


In [4]:
%pylab inline
def dist_plot(df, title):
    from datetime import date
    ax = df.hist(bins=100)
    ax.set_xlabel('AGE (days)')
    ax.set_ylabel('Count')
    ax.set_title('{} as of {}'.format(title, date.today()))
    
dist_plot(sdf['AGE'], 'Argo float AGE distribution')


Populating the interactive namespace from numpy and matplotlib

There are over 600 floats with an AGE of 0. The .get_oxy_floats_from_status() method does not select these floats as I believe they are 'inactive'. Let's count the number of non-greylisted oxygen floats at various AGEs so that we can build a reasonably sized test cache.


In [5]:
sdfq = sdf.query('(AGE != 0) & (OXYGEN == 1) & (GREYLIST != 1)')
dist_plot(sdfq['AGE'], title='Argo oxygen float AGE distribution')
print 'Count age_gte 0340:', len(sdfq.query('AGE >= 340'))
print 'Count age_gte 1000:', len(sdfq.query('AGE >= 1000'))
print 'Count age_gte 2000:', len(sdfq.query('AGE >= 2000'))
print 'Count age_gte 2200:', len(sdfq.query('AGE >= 2200'))
print 'Count age_gte 3000:', len(sdfq.query('AGE >= 3000'))


Count age_gte 0340: 563
Count age_gte 1000: 314
Count age_gte 2000: 52
Count age_gte 2200: 19
Count age_gte 3000: 1
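
The filter-then-count pattern used above can be tried without the real status data. The following is a minimal sketch on a small synthetic DataFrame: the AGE, OXYGEN and GREYLIST column names come from the query above, while the WMO column, the sdf_demo name, the count_age_gte helper and all of the values are assumptions made up for illustration.

```python
from __future__ import print_function
import pandas as pd

# Synthetic stand-in for the status DataFrame (values are invented, not Argo data)
sdf_demo = pd.DataFrame({
    'WMO': ['1900001', '1900002', '1900003', '1900004', '1900005', '1900006'],  # hypothetical IDs
    'AGE': [0, 350, 1200, 2300, 2500, 3100],
    'OXYGEN': [1, 1, 1, 1, 0, 1],
    'GREYLIST': [0, 0, 0, 1, 0, 0],
})

# How many floats report an AGE of 0
n_zero = int((sdf_demo['AGE'] == 0).sum())

# Same filter expression as the query in cell In [5]
sdfq_demo = sdf_demo.query('(AGE != 0) & (OXYGEN == 1) & (GREYLIST != 1)')

def count_age_gte(df, thresholds):
    """Return {threshold: number of rows with AGE >= threshold}."""
    return {t: int((df['AGE'] >= t).sum()) for t in thresholds}

counts = count_age_gte(sdfq_demo, [340, 1000, 2000, 2200, 3000])
print('Floats with AGE == 0:', n_zero)
for t in sorted(counts):
    print('Count age_gte {:04d}: {}'.format(t, counts[t]))
```

The counts this prints describe only the six synthetic rows. The figures for the real status data are the ones shown in the cell output above.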

Compare the 2200 count with what .get_oxy_floats_from_status(age_gte=2200) returns.


In [6]:
len(ad.get_oxy_floats_from_status(age_gte=2200))


Out[6]:
19

That's reassuring! Now, let's build a custom cache file with the 19 floats that have an AGE >= 2200 days.

From a shell window execute this script:

scripts/load_cache.py --age 2200 --profiles 2 -v

This will take several minutes to download the data and build the cache. Once it's finished you can execute the cells below (you will need to enter the exact name of the cache_file which the above command displays in its INFO messages).
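
If the INFO messages have scrolled out of view, one way to recover the file name is to list the .hdf files in the cache directory. This is a minimal sketch using only the standard library. The find_cache_files helper is hypothetical, and it assumes the cache is written to the ../oxyfloat directory with an .hdf extension, as in the next cell.

```python
from __future__ import print_function
import glob
import os

def find_cache_files(cache_dir, pattern='*.hdf'):
    """Return files in cache_dir matching pattern, most recently modified first."""
    paths = glob.glob(os.path.join(cache_dir, pattern))
    return sorted(paths, key=os.path.getmtime, reverse=True)

# Assumed cache location; prints nothing if the directory has no .hdf files
for path in find_cache_files('../oxyfloat'):
    print(path)
```

The most recently modified file is listed first, so a cache that has just been built should appear at the top.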


In [7]:
%%time
ad = ArgoData(cache_file='../oxyfloat/oxyfloat_fixed_cache_age2200_profiles2.hdf')
wmo_list = ad.get_oxy_floats_from_status(age_gte=2200)
df = ad.get_float_dataframe(wmo_list, max_profiles=2)


CPU times: user 3.39 s, sys: 159 ms, total: 3.55 s
Wall time: 1min 5s

Plot the profiles.


In [8]:
# Parameter long_name and units copied from attributes in NetCDF files
time_range = '{} to {}'.format(df.index.get_level_values('time').min(), 
                               df.index.get_level_values('time').max())
parms = {'TEMP_ADJUSTED': 'SEA TEMPERATURE IN SITU ITS-90 SCALE (degree_Celsius)', 
         'PSAL_ADJUSTED': 'PRACTICAL SALINITY (psu)',
         'DOXY_ADJUSTED': 'DISSOLVED OXYGEN (micromole/kg)'}

plt.rcParams['figure.figsize'] = (18.0, 8.0)
fig, ax = plt.subplots(1, len(parms), sharey=True)
ax[0].invert_yaxis()
ax[0].set_ylabel('SEA PRESSURE (decibar)')

for i, (p, label) in enumerate(parms.iteritems()):
    ax[i].set_xlabel(label)
    ax[i].plot(df[p], df.index.get_level_values('pressure'), '.')
    
plt.suptitle('Float(s) ' + ' '.join(wmo_list) + ' from ' + time_range)


Out[8]:
<matplotlib.text.Text at 0x7f8d9693de90>
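
The plotting cells read time, pressure, lon and lat from the DataFrame's index rather than from its columns. The following is a minimal sketch of that access pattern on a synthetic frame: the level names match those used in the cells, but the level order, the df_demo name and all of the values are assumptions, and the real frame returned by get_float_dataframe() may carry additional levels or columns.

```python
from __future__ import print_function
import pandas as pd

# Synthetic two-profile frame; values are invented for illustration
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp('2015-01-01'), -150.0, 30.0, 5.0),
        (pd.Timestamp('2015-01-01'), -150.0, 30.0, 100.0),
        (pd.Timestamp('2015-01-11'), -149.5, 30.2, 5.0),
        (pd.Timestamp('2015-01-11'), -149.5, 30.2, 100.0),
    ],
    names=['time', 'lon', 'lat', 'pressure'],
)
df_demo = pd.DataFrame({'DOXY_ADJUSTED': [210.0, 180.0, 212.0, 179.0]}, index=index)

# get_level_values() returns one value per row for the named index level
times = df_demo.index.get_level_values('time')
pressures = df_demo.index.get_level_values('pressure')

print('{} to {}'.format(times.min(), times.max()))
print(list(pressures))
```

Because each level is repeated once per row, the values line up with the data columns and can be passed straight to plot() or scatter().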

Plot the profiles on a map.


In [9]:
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

plt.rcParams['figure.figsize'] = (18.0, 8.0)
m = Basemap(llcrnrlon=15, llcrnrlat=-90, urcrnrlon=390, urcrnrlat=90, projection='cyl')
m.fillcontinents(color='0.8')

m.scatter(df.index.get_level_values('lon'), df.index.get_level_values('lat'), latlon=True)


Out[9]:
<matplotlib.collections.PathCollection at 0x7f8d96510390>