Most modern circulation models discretize the partial differential equations needed to simulate the earth system on a logically rectangular grid. This means the grid for a single time step can be represented as a 3-dimensional array of cells. Even for more complex grid geometries like here, subdomains are usually organized in this manner. A noteable exception are models with unstructured grids example, which currently cannot be processed with the datamodel of xarray and xgcm.
Our grid operators work on the logically rectangular grid of an ocean model, meaning that e.g. differences are evaluated on the 'neighboring' cells in either direction, but even though these cells are adjacent, cells can have different size and geometry.
In order to convert operators acting on the logically rectangular grid to physically meaningful output models need 'metrics' - information about the grid cell geometry in physical space. In the case of a perfectly rectangular cuboid, the only metrics needed would be three of the edge distances. All other distances can be reconstructed exactly. Most ocean models have however slightly distorted cells, due to the curvature of the earth. To accurately represent the volume of the cell we require more metrics.
Each grid point has three kinds of fundamental metrics associated with it which differ in the number of described axes:
('X',)
,('Y',)
or ('Z',)
). Each distance describes the distance from the point to either face of the cell associated with the grid point. ('X', 'Y')
, ('Y', 'Z')
and ('X', 'Z')
). Each grid point intersects three areas.('X', 'Y', 'Z')
).Once the user assigns the metrics (given as coordinates in most model output) to the grid
object, xgcm is able to automatically select and apply these to calculate e.g. derivatives and integrals from model data.
In [1]:
import xarray as xr
import numpy as np
from xgcm import Grid
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
# hack to make file name work with nbsphinx and binder
import os
fname = '../datasets/mitgcm_example_dataset_v2.nc'
if not os.path.exists(fname):
fname = '../' + fname
ds = xr.open_dataset(fname)
ds
Out[2]:
For mitgcm output we need to first incorporate partial cell thicknesses into new metric coordinates
In [3]:
ds['drW'] = ds.hFacW * ds.drF #vertical cell size at u point
ds['drS'] = ds.hFacS * ds.drF #vertical cell size at v point
ds['drC'] = ds.hFacC * ds.drF #vertical cell size at tracer point
To assign the metrics, the user has to provide a dictionary with keys and entries corresponding to the spatial orientation of the metrics and a list of the appropriate variable names in the dataset.
In [4]:
metrics = {
('X',): ['dxC', 'dxG'], # X distances
('Y',): ['dyC', 'dyG'], # Y distances
('Z',): ['drW', 'drS', 'drC'], # Z distances
('X', 'Y'): ['rA', 'rAz', 'rAs', 'rAw'] # Areas
}
grid = Grid(ds, metrics=metrics)
It is now possible to integrate over any grid axis. For example, we can integrate over the Z axis to compute the discretized version of:
$$ \int_{-H}^0 u dz$$in one line:
In [5]:
grid.integrate(ds.UVEL, 'Z').plot();
This is equivalent to doing (ds.UVEL * ds.drW).sum('Z')
, with the advantage of not having to remember the name of the appropriate metric and the matching dimension. The only thing the user needs to input is the axis to integrate over.
In [6]:
a = grid.integrate(ds.UVEL, 'Z')
b = (ds.UVEL * ds.drW).sum('Z')
xr.testing.assert_equal(a, b)
We can do the exact same thing on a tracer field (which is located on a different grid point) by using the exact same syntax:
In [7]:
grid.integrate(ds.SALT, 'Z').plot();
It also works in two dimensions:
In [8]:
a = grid.integrate(ds.UVEL, ['X', 'Y'])
a.plot(y='Z')
# Equivalent to integrating over area
b = (ds.UVEL * ds.rAw).sum(['XG', 'YC'])
xr.testing.assert_equal(a, b)
And finally in 3 dimensions, this time using the salinity of the tracer cell:
In [9]:
print('Spatial integral of zonal velocity: ',grid.integrate(ds.UVEL, ['X', 'Y', 'Z']).values)
But wait, we did not provide a cell volume when setting up the Grid
. What happened?
Whenever no matching metric is provided, xgcm will default to reconstruct it from the other available metrics, in this case the area and z distance of the tracer cell
In [10]:
a = grid.integrate(ds.SALT, ['X', 'Y', 'Z'])
b = (ds.SALT * ds.rA * ds.drC).sum(['XC', 'YC', 'Z'])
xr.testing.assert_allclose(a, b)
In [11]:
# depth mean salinity
grid.average(ds.SALT, ['Z']).plot();
Equivalently, this can be computed with the xgcm operations:
(ds.SALT * ds.drF).sum('Z') / ds.drF.sum('Z')
See also for zonal velocity:
In [12]:
# depth mean zonal velocity
grid.average(ds.UVEL, ['Z']).plot();
This works with multiple dimensions as well:
In [13]:
# horizontal average zonal velocity
grid.average(ds.UVEL, ['X','Y']).plot(y='Z');
In [14]:
# average salinity of the global ocean
# horizontal average zonal velocity
print('Volume weighted average of salinity: ',grid.average(ds.SALT, ['X','Y', 'Z']).values)
Using the metric-aware cumulative integration cumint
, we can calculate the barotropic transport streamfunction even easier and more intuitive in one line:
In [15]:
# the streamfunction is the cumulative integral of the vertically integrated zonal velocity along y
psi = grid.cumint(-grid.integrate(ds.UVEL,'Z'),'Y', boundary='fill')
maskZ = grid.interp(ds.hFacS, 'X').isel(Z=0)
(psi / 1e6).squeeze().where(maskZ).plot.contourf(levels=np.arange(-160, 40, 5));
Here, cumint
is performing the discretized form of:
where $U = \int_{-H}^0 u dz$, and under the hood looks like the following operation:
grid.cumsum( -grid.integrate(ds.UVEL,'Z') * ds.dyG, 'Y', boundary='fill')
Except that, once again, one does not have to remember the matching metric while using cumint
.
In a similar fashion to integration, xgcm uses metrics
to compute derivatives.
For this example we show vertical shear, i.e. the derivative of some quantity in the vertical.
At it's core, derivative
is based on diff
, which shifts a data array
to a new grid point, as shown
here.
Because of this shifting, we need to either define new metrics which live at the right points on the grid,
or first interpolate the desired quantities, anticipating the shift.
Here we choose the latter, and interpolate velocities and temperature onto the vertical cell faces of the grid
cells.
The resulting quantities are in line with the vertical velocity w
, which is shown in the vertical grid of the C
grid here.
In [16]:
uvel_l = grid.interp(ds.UVEL,'Z')
vvel_l = grid.interp(ds.VVEL,'Z')
theta_l = grid.interp(ds.THETA,'Z')
The subscript "l" is used to denote a leftward shift on the vertical axis, following this nomenclature.
As a first example, we show zonal velocity shear in the top layer, which is the finite difference version of:
$$ \frac{\partial u}{\partial z}\Big|_{z=-25m} $$
In [17]:
zonal_shear = grid.derivative(uvel_l,'Z')
zonal_shear.isel(Z=0).plot();
and the underlying xgcm operations are:
grid.diff( uvel_l, 'Z' ) / ds.drW
Which is shown to be equivalent below:
In [18]:
expected_result = (grid.diff( uvel_l, 'Z') ) /ds.drW
xr.testing.assert_equal(zonal_shear, expected_result.reset_coords(drop=True))
A note on dimensions: here we first interpolated from "Z"->"Zl" and the derivative operation shifted the result back from "Zl"->"Z".
In [19]:
print('1. ', ds.UVEL.dims)
print('2. ', uvel_l.dims)
print('3. ', zonal_shear.dims)
For reference, the vertical profiles of horizontal average of zonal velocity and zonal velocity shear are shown below.
In [20]:
fig,axs = plt.subplots(1,2,figsize=(12,8))
titles=['Horizontal average of zonal velocity, $u$',
'Horizontal average of zonal velocity shear, $\partial u/\partial z$']
for ax,fld,title in zip(axs,[ds.UVEL,zonal_shear],titles):
# Only select non-land (a.k.a. wet) points
fld = fld.where(ds.maskW).isel(time=0).copy()
grid.average(fld,['X','Y']).plot(ax=ax,y='Z')
ax.grid();
ax.set_title(title);
And finally, for meridional velocity and temperature in the top layer:
In [21]:
grid.derivative(vvel_l,'Z').isel(Z=0).plot();
In [22]:
grid.derivative(theta_l,'Z').isel(Z=0).plot();
In [23]:
ds['dxF'] = grid.interp(ds.dxC,'X')
ds['dyF'] = grid.interp(ds.dyC,'Y')
In [24]:
metrics = {
('X',): ['dxC', 'dxG','dxF'], # X distances
('Y',): ['dyC', 'dyG','dyF'], # Y distances
('Z',): ['drW', 'drS', 'drC'], # Z distances
('X', 'Y'): ['rA', 'rAz', 'rAs', 'rAw'] # Areas
}
grid = Grid(ds, metrics=metrics)
Here we show temperature interpolated in the X direction: from the tracer location to where zonal velocity is located, i.e. from t
to u
in the horizontal view of the C grid shown here.
In [25]:
grid.interp(ds.THETA.where(ds.maskC),'X',metric_weighted=['X','Y']).isel(Z=0).plot();
This area weighted interpolation conserves tracer content in the horizontal, at least to first order as defined by the underlying interpolation operation. Note that in this example the difference between an area weighted interpolation and standard interpolation (i.e. arithmetic mean for first order) is quite small because the underlying field is smooth.