In this exercise we plot all the data locations available for a given day in the latest directory of the global product (INSITU_GLO_NRT_OBSERVATIONS_013_030). We assume the data have been downloaded to the following directory:



In [1]:

    
datadir = "~/CMEMS_INSTAC/INSITU_GLO_NRT_OBSERVATIONS_013_030/latest/20151201/"



In [2]:

    
%matplotlib inline
import matplotlib.pyplot as plt
import glob
import os
import netCDF4
import numpy as np

File reading

We create a list of the files available in the directory:



In [3]:

    
datadir = os.path.expanduser(datadir)
filelist = sorted(glob.glob(datadir + '*nc'))
nfiles = len(filelist)
print("Number of files = %u" % (nfiles))

Number of files = 4839

Now we loop over the first 10 files to check that the list looks correct:



In [4]:

    
for datafile in filelist[0:10]:
    print(os.path.basename(datafile))

AR_LATEST_PR_CT_MYO_AR_ITP_82_20151201.nc
AR_LATEST_PR_CT_MYO_AR_ITP_83_20151201.nc
AR_LATEST_PR_CT_MYO_AR_ITP_89_20151201.nc
AR_LATEST_PR_CT_MYO_AR_ITP_91_20151201.nc
AR_LATEST_PR_CT_MYO_AR_ITP_92_20151201.nc
AR_LATEST_PR_CT_MYO_AR_ITP_93_20151201.nc
AR_LATEST_TS_MO_MYO_AR_Blakksnes_20151201.nc
AR_LATEST_TS_MO_MYO_AR_Drangsnes_20151201.nc
AR_LATEST_TS_MO_MYO_AR_Gardskagadufl_20151201.nc
AR_LATEST_TS_MO_MYO_AR_Grimsey_20151201.nc

Basic plot

We loop over the files, read the coordinate variables (LONGITUDE and LATITUDE) from each one, and keep their mean as the platform position to plot on the map:



In [5]:

    
lon = np.zeros(nfiles)
lat = np.zeros(nfiles)
for count, datafiles in enumerate(filelist):
    with netCDF4.Dataset(datafiles) as nc:
        lon[count] = np.nanmean(nc.variables['LONGITUDE'][:])
        lat[count] = np.nanmean(nc.variables['LATITUDE'][:])

/usr/local/lib/python2.7/dist-packages/numpy/lib/nanfunctions.py:675: RuntimeWarning: Mean of empty slice
  warnings.warn("Mean of empty slice", RuntimeWarning)
/usr/local/lib/python2.7/dist-packages/numpy/ma/core.py:4085: UserWarning: Warning: converting a masked element to nan.
  warnings.warn("Warning: converting a masked element to nan.")
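The warnings above suggest that some files contain coordinate arrays that are empty or entirely masked. A minimal sketch of one way to handle such cases explicitly is given below. The helper name `safe_mean` is hypothetical (it is not part of the notebook), and the arrays are synthetic stand-ins for `nc.variables['LONGITUDE'][:]`.

```python
import numpy as np

def safe_mean(values):
    """Mean of the valid (unmasked, finite) entries, or NaN if there are none."""
    arr = np.ma.masked_invalid(values)   # also keeps any mask already present
    if arr.count() == 0:                 # no valid value at all
        return np.nan
    return float(arr.mean())

# Synthetic examples (assumed shapes, not real data)
partly_masked = np.ma.array([10., 20., 30.], mask=[False, False, True])
fully_masked = np.ma.masked_all(3)
with_nan = [1., np.nan, 3.]

print(safe_mean(partly_masked), safe_mean(fully_masked), safe_mean(with_nan))
```

With this approach the result for a file without valid positions is still NaN, but it is produced deliberately rather than through a warning.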

We also mask the coordinate values that fall outside the valid range:



In [6]:

    
lon = np.ma.masked_outside(lon, -180., 360.)
lat = np.ma.masked_outside(lat, -90., 90.)

/usr/local/lib/python2.7/dist-packages/numpy/ma/core.py:2107: RuntimeWarning: invalid value encountered in less
  condition = (xf < v1) | (xf > v2)
/usr/local/lib/python2.7/dist-packages/numpy/ma/core.py:2107: RuntimeWarning: invalid value encountered in greater
  condition = (xf < v1) | (xf > v2)
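The "invalid value" warnings are what NumPy emits when NaN values take part in a comparison; a NaN compares as false on both sides of the range test, so it is not masked by `masked_outside` alone. As a sketch under that assumption, the NaN values can be masked first and the range test applied afterwards. The array below is synthetic.

```python
import numpy as np

# Synthetic longitudes: two out of range, one NaN, two valid
lon = np.array([-200., 10., np.nan, 359., 400.])

# Mask NaN first, then everything outside the accepted range
lon_masked = np.ma.masked_outside(np.ma.masked_invalid(lon), -180., 360.)

print(lon_masked)
```

Only the entries 10 and 359 remain unmasked after the two steps.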



In [7]:

    
fig = plt.figure(figsize=(8, 8))
plt.plot(lon, lat, 'ko', markersize=2)
plt.show()

Improved plot

We create a map projection, which gives access to the coastline, the land mask and other map features.



In [8]:

    
from mpl_toolkits.basemap import Basemap
m = Basemap(projection='moll', lon_0=0, resolution='c')
lon_p, lat_p = m(lon, lat)



In [9]:

    
import matplotlib
font = {'family' : 'serif',
        'size'   : 16}

matplotlib.rc('font', **font)



In [10]:

    
fig = plt.figure(figsize=(10,8))
m.plot(lon_p, lat_p, 'ko', markersize=2)
m.drawparallels(np.arange(-80, 80., 20.),labels=[True,False,False,True], zorder=2)
m.drawmeridians(np.arange(-180, 180., 30.), zorder=2)
m.fillcontinents(color='gray', zorder=3)
m.drawcoastlines(linewidth=0.5)
plt.title('Platform locations on\n December 1st, 2015', fontsize=20)

plt.show()

/usr/local/lib/python2.7/dist-packages/matplotlib/collections.py:590: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  if self._edgecolors == str('face'):

It is interesting to see the highly variable coverage.
For example near the Equator in the Atlantic Ocean, the coverage is very low.
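One rough way to quantify this uneven coverage is to count the platforms in latitude bands. The sketch below uses a hypothetical helper, `count_per_band`, and a handful of synthetic latitudes; with the notebook's data one would pass the masked `lat` array instead.

```python
import numpy as np

def count_per_band(lat, width=30.):
    """Count valid latitudes in bands of `width` degrees between -90 and 90."""
    edges = np.arange(-90., 90. + width, width)
    valid = np.ma.masked_invalid(lat).compressed()   # drop masked and NaN values
    counts, _ = np.histogram(valid, bins=edges)
    return edges, counts

# Synthetic latitudes; the last value is masked as a bad position
lat_demo = np.ma.array([-75., -10., 5., 12., 45., 80., 999.],
                       mask=[0, 0, 0, 0, 0, 0, 1])
edges, counts = count_per_band(lat_demo)
for south, north, n in zip(edges[:-1], edges[1:], counts):
    print("%4.0f to %4.0f: %d platforms" % (south, north, n))
```

Equal-width latitude bands do not cover equal ocean areas, so the counts are only a first indication of the coverage.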

Plot by platform

The idea is to use a different color for each platform type:

drifting buoys,
gliders,
moorings, ...

To do so, we will create a list of files for each data type.
There is an extensive description of all the data types in this document.



In [11]:

    
datatypelist = ('BA', 'BO', 'CT', 'DB', 'DC', 'FB', 'GL', 'MO', 'ML', 'PF', 'RF', 'TE', 'TS', 'XB')
for datatype in datatypelist:
    filelist = sorted(glob.glob(datadir + '*LATEST_*_' + datatype + '*nc'))
    nfiles = len(filelist)
    print("Number of files of type %s = %u" % (datatype, nfiles))

Number of files of type BA = 6
Number of files of type BO = 1
Number of files of type CT = 6
Number of files of type DB = 1417
Number of files of type DC = 1032
Number of files of type FB = 6
Number of files of type GL = 4
Number of files of type MO = 700
Number of files of type ML = 2
Number of files of type PF = 1346
Number of files of type RF = 89
Number of files of type TE = 218
Number of files of type TS = 14
Number of files of type XB = 1
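The counts above add up to 4842, while 4839 files were found earlier, which suggests that a few file names match more than one glob pattern. As a sketch of an alternative, the data type can be read from the file name itself, assuming (as the pattern `*LATEST_*_` and the file listing suggest) that it is the fourth underscore-separated field. The helper name `datatype_of` is hypothetical, and the sample names are taken from the listing printed earlier.

```python
import os
from collections import Counter

def datatype_of(filename):
    """Return the 4th underscore-separated field of the file name
    (assumed here to be the data type code)."""
    return os.path.basename(filename).split('_')[3]

sample = [
    "AR_LATEST_PR_CT_MYO_AR_ITP_82_20151201.nc",
    "AR_LATEST_PR_CT_MYO_AR_ITP_83_20151201.nc",
    "AR_LATEST_TS_MO_MYO_AR_Blakksnes_20151201.nc",
    "AR_LATEST_TS_MO_MYO_AR_Grimsey_20151201.nc",
]

# Each file is counted exactly once, whatever its name contains elsewhere
counts = Counter(datatype_of(f) for f in sample)
for datatype, n in sorted(counts.items()):
    print("Number of files of type %s = %u" % (datatype, n))
```

Applied to the full `filelist`, this would assign every file to a single type, so the per-type counts would sum to the total number of files.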

We see that some data types are represented by only a few files, so we keep only those with at least 50 data files.
We create a dictionary that maps each abbreviation to the corresponding data type:



In [12]:

    
datatypename = {'DB': 'Drifting buoys', 'DC': 'Drifting buoy reporting calculated sea water current',
                'MO': 'Fixed buoys or mooring time series', 'PF': 'Profiling floats vertical profiles',
                'RF': 'River flows', 'TE': 'TESAC messages on GTS'}
datatypename.keys()

    Out[12]:
['MO', 'DB', 'DC', 'RF', 'PF', 'TE']

We create a list of colors for the plot:



In [19]:

    
colorlist = ['red', 'yellowgreen', 'lightskyblue', 'gold', 'violet', 'lightgray']
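The colors are paired with the data types by position, so the pairing depends on the order in which the dictionary returns its keys. A small sketch of an explicit mapping follows; the name `colors` and the particular pairing are illustrative choices, not taken from the notebook.

```python
# Data type codes kept for the plot, in a fixed (alphabetical) order
datatypes = ['DB', 'DC', 'MO', 'PF', 'RF', 'TE']
colorlist = ['red', 'yellowgreen', 'lightskyblue', 'gold', 'violet', 'lightgray']

# Explicit code -> color mapping, independent of dictionary ordering
colors = dict(zip(datatypes, colorlist))

for datatype in datatypes:
    print(datatype, colors[datatype])
```

Looking up `colors[datatype]` inside the plotting loop would then give the same color for a given type on every run.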



In [14]:

    
m = Basemap(projection='robin', lat_0=0, lon_0=0., resolution='l')



In [20]:

    
fig = plt.figure(figsize=(10,8))
ax = plt.subplot(111)
nfiles = np.zeros(len(datatypename), dtype=int)  # integer counts, so they can be used as array sizes below
for ntype, datatype in enumerate(datatypename.keys()):
    filelist = sorted(glob.glob(datadir + '*LATEST_*_' + datatype + '*nc'))
    nfiles[ntype] = len(filelist)
    lon = np.zeros(nfiles[ntype])
    lat = np.zeros(nfiles[ntype])
    for count, datafiles in enumerate(filelist):
        with netCDF4.Dataset(datafiles) as nc:
            lon[count] = np.nanmean(nc.variables['LONGITUDE'][:])
            lat[count] = np.nanmean(nc.variables['LATITUDE'][:])
    lon_p, lat_p = m(lon, lat)
    m.plot(lon_p, lat_p, 'o', markerfacecolor=colorlist[ntype], markeredgecolor=colorlist[ntype],
           markersize=3, label=datatypename[datatype] + ': ' + str(int(nfiles[ntype])) + ' files')
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.legend(loc='center left', bbox_to_anchor=(1, 0.5),fontsize=14)

m.drawparallels(np.arange(-80, 80., 30.),labels=[True,False,False,True], zorder=2)
m.drawmeridians(np.arange(-180, 180., 90.),labels=[True,False,False,True], zorder=2)
m.fillcontinents(color='gray', zorder=3)
m.drawcoastlines(linewidth=0.5)
plt.title('Platform locations on\n December 1st, 2015', fontsize=20)
plt.savefig('./figures/platform_types_20151201.png', dpi=300)
plt.show()

Pie chart

It is also useful to have a pie chart showing the relative importance of each data type.



In [21]:

    
fig2 = plt.figure(figsize=(10, 8)) 
plt.pie(nfiles, labels=list(datatypename.values()), colors=colorlist,
        autopct='%1.1f%%', startangle=90)
plt.savefig('./figures/platform_piechart_20151201.png', dpi=300)
plt.show()
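The percentages shown on the pie chart can also be computed directly. The sketch below uses the file counts printed earlier for the six selected data types; the variable names are illustrative.

```python
# File counts per data type, copied from the output printed earlier
counts = {'DB': 1417, 'DC': 1032, 'MO': 700, 'PF': 1346, 'RF': 89, 'TE': 218}

total = sum(counts.values())
shares = {code: 100.0 * n / total for code, n in counts.items()}

# Print the shares from the largest to the smallest
for code in sorted(shares, key=shares.get, reverse=True):
    print("%s: %.1f%%" % (code, shares[code]))
```

This is the same normalization that `autopct='%1.1f%%'` applies to the values passed to `plt.pie`.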


