In [ ]:
%matplotlib inline
%config InlineBackend.figure_format = "retina"
from matplotlib import rcParams
rcParams["savefig.dpi"] = 200
rcParams["font.size"] = 8
import warnings
warnings.filterwarnings("ignore")
In this tutorial we will show how to create 3D boolean masks for arbitrary latitude and longitude grids. It uses the same algorithm to determine if a gridpoint is in a region as for the 2D mask. However, it returns a xarray.Dataset
with shape region x lat x lon
, gridpoints that do not fall in a region are False
, the gridpoints that fall in a region are True
.
3D masks are convenient as they can be used to directly calculate weighted regional means (over all regions) using xarray v0.15.1 or later. Further, the mask includes the region names and abbreviations as non-dimension coordinates.
Import regionmask and check the version:
In [ ]:
import regionmask
regionmask.__version__
Load xarray and numpy:
In [ ]:
import xarray as xr
import numpy as np
In [ ]:
lon = np.arange(-179.5, 180)
lat = np.arange(-89.5, 90)
We will create a mask with the SREX regions (Seneviratne et al., 2012).
In [ ]:
regionmask.defined_regions.srex
The function mask_3D
determines which gripoints lie within the polygon making up each region:
In [ ]:
mask = regionmask.defined_regions.srex.mask_3D(lon, lat)
mask
As mentioned, mask
is a boolean xarray.Dataset
with shape region x lat x lon
. It contains region
(=numbers
) as dimension coordinate as well as abbrevs
and names
as non-dimension coordinates (see the xarray docs for the details on the terminology). The four first layers look as follows:
In [ ]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
from matplotlib import colors as mplc
cmap1 = mplc.ListedColormap(["none", "#9ecae1"])
fg = mask.isel(region=slice(4)).plot(
subplot_kws=dict(projection=ccrs.PlateCarree()),
col="region",
col_wrap=2,
transform=ccrs.PlateCarree(),
add_colorbar=False,
aspect=1.5,
cmap=cmap1,
)
for ax in fg.axes.flatten():
ax.coastlines()
fg.fig.subplots_adjust(hspace=0, wspace=0.1);
In [ ]:
airtemps = xr.tutorial.load_dataset("air_temperature")
The example data is a temperature field over North America. Let's plot the first time step:
In [ ]:
# choose a good projection for regional maps
proj = ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)
airtemps.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())
ax.coastlines();
An xarray object can be passed to the mask_3D
function:
In [ ]:
mask_3D = regionmask.defined_regions.srex.mask_3D(airtemps)
mask_3D
Per default this creates a mask
containing one layer (slice) for each region containing (at least) one gridpoint. As the example data only has values over Northern America we only get only 6 layers even though there are 26 SREX regions. To obtain all layers specify drop=False
:
In [ ]:
mask_full = regionmask.defined_regions.srex.mask_3D(airtemps, drop=False)
mask_full
Note mask_full
now has 26 layers.
In [ ]:
# 1) by the index of the region:
r1 = mask_3D.sel(region=3)
# 2) with the abbreviation
r2 = mask_3D.isel(region=(mask_3D.abbrevs == "WNA"))
# 3) with the long name:
r3 = mask_3D.isel(region=(mask_3D.names == "E. North America"))
This also applies to the regionally-averaged data below.
It is currently not possible to use sel
with a non-dimension coordinate - to directly select abbrev
or name
you need to create a MultiIndex
:
In [ ]:
mask_3D.set_index(regions=["region", "abbrevs", "names"]);
In [ ]:
airtemps_cna = airtemps.where(r1)
Which looks as follows:
In [ ]:
proj = ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)
airtemps_cna.isel(time=1).air.plot(ax=ax, transform=ccrs.PlateCarree())
ax.coastlines();
We could now use airtemps_cna
to calculate the regional average for 'Central North America'. However, there is a more elegant way.
Using the 3-dimensional mask it is possible to calculate weighted averages of all regions in one go, using the weighted
method (requires xarray 0.15.1 or later). As proxy of the grid cell area we use cos(lat)
.
In [ ]:
weights = np.cos(np.deg2rad(airtemps.lat))
ts_airtemps_regional = airtemps.weighted(mask_3D * weights).mean(dim=("lat", "lon"))
Let's break down what happens here. By multiplying mask_3D * weights
we get a DataArray where gridpoints not in the region get a weight of 0. Gridpoints within a region get a weight proportional to the gridcell area. airtemps.weighted(mask_3D * weights)
creates an xarray object which can be used for weighted operations. From this we calculate the weighted mean
over the lat and lon dimensions. The resulting dataarray has the dimensions region x time
:
In [ ]:
ts_airtemps_regional
The regionally-averaged time series can be plotted:
In [ ]:
ts_airtemps_regional.air.plot(col="region", col_wrap=3);
With this caveat in mind we can create the land-sea mask:
In [ ]:
land_110 = regionmask.defined_regions.natural_earth.land_110
land_mask = land_110.mask_3D(airtemps)
and plot it
In [ ]:
proj = ccrs.LambertConformal(central_longitude=-100)
ax = plt.subplot(111, projection=proj)
land_mask.squeeze().plot.pcolormesh(
ax=ax, transform=ccrs.PlateCarree(), cmap=cmap1, add_colorbar=False
)
ax.coastlines();
To create the combined mask we multiply the two:
In [ ]:
mask_lsm = mask_3D * land_mask.squeeze(drop=True)
Note the .squeeze(drop=True)
. This is required to remove the region
dimension from land_mask
.
Finally, we compare the original mask with the one restricted to land points:
In [ ]:
f, axes = plt.subplots(1, 2, subplot_kw=dict(projection=proj))
ax = axes[0]
mask_3D.sel(region=2).plot(
ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False, cmap=cmap1
)
ax.coastlines()
ax.set_title("Regional mask: all points")
ax = axes[1]
mask_lsm.sel(region=2).plot(
ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False, cmap=cmap1
)
ax.coastlines()
ax.set_title("Regional mask: land only");