The data can be obtained from Coriolis FTP at ftp://ftp1.ifremer.fr/Core/INSITU_GLO_TS_REP_OBSERVATIONS_013_001_b. As an illustration, we will work with the OA data for December 2013. With this dataset, the profiles are vertically interpolated on specific depth levels.
Let's assume the datafile is as follows:
In [16]:
datafile = "/home/ctroupin/DataOceano/Coriolis/CORA/NetCDF/OA_CORA4.1_20131215_dat_PSAL.nc"
We import modules to read the netCDF file and to perform basic operations on the data.
In [18]:
import cf
from netCDF4 import Dataset
import numpy as np
As most of the file content is compliant with the CF conventions, we can use the cf package to read the variable we need:
In [101]:
f = cf.read(datafile)
salinity = f.select('sea_water_salinity')[0].array
ftime = f.select('time')[0].array
lon = f.select('lon')[0].array
lat = f.select('lat')[0].array
depth = f.select('depth')[0].array
f.close()
We are just missing the variable that stores the quality flags. We will use the netCDF4 module.
In [20]:
with Dataset(datafile) as nc:
salinityQC = nc.variables['PSAL_QC'][:]
The salinity data are organised in profiles, each of them with a time, a longitude and a latitude. We can exmaine the dimension of the coordinates:
In [21]:
print str(len(lon)) + ' available profiles'
print 'on ' + str(len(depth)) + ' depth levels.'
We can take the data on a given depth level to perform basic operations and a plot. Let's try with a depth of 100 meters.
In [22]:
mydepth = 100.
We compute the index of the level corresponding to that depth.
In [23]:
depthindex = np.argmin(abs(depth - mydepth))
print 'depthindex = ' + str(depthindex)
Now we take the salinity and its corresponding flags at that level.
In [44]:
salinity_atdepth = salinity[:, depthindex]
salinity_atdepth_QC = salinityQC[:, depthindex]
Let's examine the shape of the variables:
In [45]:
salinity_atdepth.shape
Out[45]:
This would mean that we have 26693 measurements available at that depth. However, some values can be masked, because the profile doesn't reach that depth. So to compute the real number of observations, we can use count which is an in-place method of cf.
In [46]:
goodmeasurements = salinity_atdepth.count()
print str(goodmeasurements) + ' data points available'
A quality flag (QF) is attributed to each measurement. We can easily count the measurements for each QF, using a loop from 0 to 9.
In [47]:
for qf in np.arange(0, 10):
print 'QF=%s: %s profiles' % (qf, np.sum(salinity_atdepth_QC == qf))
In [28]:
salinity_atdepth.shape[0] - salinity_atdepth.count()
Out[28]:
We can check what the minimal and maximal salinity values are, and their positions. Later we will represent them on a plot.
In [48]:
minsalinity, maxsalinity = np.min(salinity_atdepth), np.max(salinity_atdepth)
minsalinityindex, maxsalinityindex = np.argmin(salinity_atdepth), np.argmax(salinity_atdepth)
lonmin, latmin = lon[minsalinityindex], lat[minsalinityindex]
lonmax, latmax = lon[maxsalinityindex], lat[maxsalinityindex]
print 'Minimum salinity = %s at %sE, %sN' % (minsalinity, lonmin, latmin)
print 'Maximum salinity = %s at %sE, %sN' % (maxsalinity, lonmax, latmax)
In [49]:
meansalinity = salinity_atdepth.mean()
stdsalinity = salinity_atdepth.std()
print 'Mean salinity = ' + str(meansalinity)
print 'Standard deviation = ' + str(stdsalinity)
In [71]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib import colors
We will only consider the observations with QF=1. To do this, we could try selecting only the values of the array that satisfy the condition:
In [52]:
salinity_plot = salinity_atdepth[salinity_atdepth_QC == 1]
print salinity_plot.shape
print lon.shape
but then the problem is that the new array has a different size from the size of the coordinates. This is why we prefer to mask the bad values (QF not equal to 1).
In [130]:
salinity_atdepth_masked = np.ma.masked_where(salinity_atdepth_QC != 1, salinity_atdepth)
lon_masked = np.ma.masked_where(salinity_atdepth_QC != 1, lon)
lat_masked = np.ma.masked_where(salinity_atdepth_QC != 1, lat)
We set the fontsize to 20 pts.
In [131]:
mpl.rcParams.update({'font.size': 20})
We start with a simple histogram showing the distribution, using 0.2 unit bins.
In [132]:
fig = plt.figure(figsize=(8,6))
plt.hist(salinity_atdepth_masked, np.arange(32., 40., 0.2))
plt.xlabel('Salinity')
plt.ylabel('Number of data points')
plt.show()
We start by defining the colormap and the limits for the histogram.
In [141]:
cmap = plt.cm.get_cmap('RdYlBu_r')
Y, X = np.histogram(salinity_atdepth_masked, np.arange(32., 40., 0.2))
Then we make the plot.
In [142]:
x_span = X.max() - X.min()
C = [cmap(((x - X.min()) / x_span)) for x in X]
fig = plt.figure(figsize=(8, 6))
plt.bar(X[:-1], Y, color=C, width=X[1]-X[0])
plt.xlabel('Salinity')
plt.ylabel('Number of data points')
plt.show()
We keep the same colormap as for the histogram and we define the range of values for the scatter plot.
In [143]:
norm = colors.Normalize(vmin=32., vmax=39.)
We create a scatter plot with colors scaled 32 and 39 and indicate the positions of the minimum and maximum salinity (squares).
In [165]:
fig = plt.figure(figsize=(10,8))
scat = plt.scatter(lon, lat, s=10, c=salinity_atdepth_masked, edgecolor='None', cmap=cmap, norm=norm)
plt.plot(lonmax, latmax, 'rs', ms=10)
plt.plot(lonmin, latmin, 'bs', ms=10)
plt.colorbar(scat, extend='both')
plt.show()
We can do better by including coastline, scale and other labels.
We first need to import the Basemap module to plot data on a map, then the projection has to be defined. We select a Mollweide projection centered at 0º longitude, with a crude resolution for the coastline.
In [152]:
from mpl_toolkits.basemap import Basemap
m = Basemap(projection='moll', lon_0=0, resolution='c')
Before making the plot, it is necessary to transform the (lon, lat) coordinates into the new projection that we defined.
In [153]:
lon_p, lat_p = m(lon_masked, lat_masked)
The plot is created with the same command as before, except that we use m.scatter() instead of plt.scatter().
In addition we will:
Finally, the colorbar will be placed below the map.
In [157]:
fig = plt.figure(figsize=(10,8))
m.scatter(lon_p, lat_p, s=10, c=salinity_atdepth_masked, edgecolor='None', cmap=cmap, norm=norm)
plt.colorbar(scat, extend='both', orientation='horizontal', pad=0.05)
m.fillcontinents(color='gray', lake_color='white')
m.drawcoastlines(linewidth=0.5)
plt.title('Salinity at ' + str(mydepth) + ' meters \n' + str(goodmeasurements) + ' measurements')
plt.show()
Even with this type of scatter plot, we can see interesting characteristics of the salinity field.