This notebook allows to extract, from GEOS-Chem outputs, profiles of given chemical species above a station and then calculate the total columns. The purpose is to compare these extracted profiles with the profiles measured above a given station.
A proper comparison should account for the spatial domains of both model outputs and observations, which would require regridding operations. Since we want to compare partial and/or total columns, it is important that the used regridding scheme is conservative (both locally and globally over the entire vertical profile) so that the total mass of the tracer is preserved.
Issues
Observations made by sun tracking at ground stations have a time-dependent spatial domain orientation and observed profiles are generally not vertical.
GEOS-Chem uses vertical grids (GEOS5) defined by hybrid coordinates. Elevation coordinates are time-dependent and are unknown unless we provide external data (i.e., elevation of the base levels of the GEOS-Chem vertical grids).
The overall extent of the model's vertical grid may differ from the extent of the vertical grid used for the observations. This would lead to undefined values if the extent of the latter grid is greater than the extent of the model's grid.
Strategy
Model outputs (source grid) are regridded onto an observation-compliant, destination grid.
In a conservative 3D regridding scheme, interpolation weights would be based on the true volumes of overlapping grid cells (see below). For vertical regridding, we can use only the height of the grid cells.
Regridded fields (tracer mixing-ratio) may have undefined values, for example for cells of the destination grid that don't overlap with the model source grid. For grid-cells that partially overlap the model grid, we apply a "mask tolerance", i.e., a relative overlapping volume threshold below which the value of the grid-cell will be set as undefined. This may introduce conservation errors, but since partially overlapping cells is likely to occur only at the top level of the model vertical grid, these errors can be neglected for species that usually have a low mixing-ratio at that level.
Notes
In [1]:
%matplotlib inline
import os
import string
import numpy as np
import pandas as pd
import plotly
import matplotlib.pyplot as plt
import iris
import iris.pandas as ipandas
import iris.quickplot as iqplot
import pygchem.grid as gcgrid
In [2]:
# ### INPUT FILES (GEOS-Chem outputs) ###
in_dir = "/home/mahieu/geos.runs/run_2x2.5dc/ts_files/"
# input main directory, which should contain GEOS-Chem ouput datafields.
in_files = ["ts.joch.200401*",]
# a list of one or several GEOS-Chem datafields file(s).
#
# Each list item can be
# (1) a the name of a file present in `in_dir` or
# (2) an absolute path to a file or
# (3) any file-matching pattern using de wildcard character.
#
# Mixing CTM outputs and ND49 outputs (time series) may work, but
# datafields must not overlap in time.
# All datafields contained in the files must use the same horizontal
# grid (or a subset of this grid)!
# ### OUTPUT FILES ###
out_dir = "/home/bovy/temp/"
# path to save output files where extracted data will be written
out_profiles_basename = "*_profiles_2004_01"
# basename of output file for profiles
# (any wildcard "*" will be replaced by `station_name`)
out_columns_basename = "*_columns_2004_01"
# basename of output file for columns
out_format = "xlsx" # format of output file ("csv", "hdf5", "xls", "xlsx")
# ### TRACERS AND DIAGNOSTICS ###
tracers = ["PAN", "CO", "ACET", "C3H8", "CH2O", "C2H6", "NH3"]
# a list of tracers/diagnostics for which profiles and columns
# will be extracted/computed
categories = ["IJ-AVG-$",]
# a list of diagnostic categories (should be "IJ-AVG-$" for tracers)
other_fields = ['PSURF_PEDGE-$', 'BXHEIGHT_BXHGHT-$', 'AIRDEN_TIME-SER', 'N(AIR)_BXHGHT-$', 'TMPU_DAO-3D-$']
# additional fields names to load (format: 'diagnostic_category')
#
# Must at least include datafields required for columns calculation,
# i.e., 'PSURF_PEDGE-$', 'BXHEIGHT_BXHGHT-$', 'AIRDEN_TIME-SER',
# 'N(AIR)_BXHGHT-$', 'TMPU_DAO-3D-$'
# ### STATION ###
station_name = "JungfrauJoch" # name of the station
station_lat = 46.54806 # lat [degrees_north]
station_lon = 7.98389 # lon [degress_east]
station_pressure = 659.3 # default, fixed air pressure at the station [hPa],
# used for columns calculation using pressures.
station_altitude = 3580. # elevation a.s.l at the station [meters],
# used for columns calculation using altitudes.
station_vertical_grid_file = '/home/bovy/Grids/NDACC_vertical_Jungfraujoch_39L_2x2.5.nc'
# path to the file (CF-netCDF) that contains the altitude values
# of the vertical grid on which data will be regridded.
# ### SUPPLY ADDITIONAL INFO AND DATA ###
grid_model_name = 'GEOS57_47L' # grid model name (see below for common names)
# all GEOS-Chem ouputs specified above must use this grid
grid_model_resolution = 2.5, 2 # grid horizontal resolution (lon, lat) [degrees]
# all GEOS-Chem ouputs specified above must use this resolution
imin, imax = 76, 77 # (min, max) grid indices of the 3D region box of interest
jmin, jmax = 69, 70 # i: longitude, j: latitude, l: vertical levels
lmin, lmax = 1, 47 #
# Must match the extent that was defined for any ND49
# diagnostic output specified in `in_files`.
# Must emcompass the position of the station (see below).
#
# Used either to define the coordinates of ND49 outputs or to
# extract a subset from the global CTM datafields.
global_topography_datafile = '/home/bovy/Grids/dem_GEOS57_2x2.5_awm.nc'
# path to the file of global topography needed for resampling
# the tracer profiles on a vertical grid with fixed altitude values.
# The global topography grid must be compatible with the
# GEOS-Chem grid used by the output GEOS-Chem files.
in_dir
(UNIX only)
In [3]:
%cd $in_dir
!ls *.{nc,bpch} -all -h
In [4]:
gcgrid.CTMGrid.models
Out[4]:
In [5]:
in_abspaths = [os.path.join(in_dir, fname) if not os.path.isabs(fname) else fname
for fname in in_files]
The vertical regridding scheme implemented here is a simplified, 1-dimensional version of the scheme described below.
Conservative regridding scheme (3D spherical grids)
A regridding scheme can be written as follows:
$$\Phi_{dj} = \sum_i p_{ij} \Phi_{si}$$where $\Phi$ is the field to be regridded (extensive quantity), $\Phi_{si}$ is the constant value of the field of in the $i^{th}$ cell of the source grid, $\Phi_{dj}$ is the constant value of the field of in the $j^{th}$ cell of the destination grid, and $p_{ij}$ are the coefficients of the sparse interpolation matrix.
For the conservative regridding scheme used here, $p_{ij}$ is given by:
$$p_{ij} = \frac{V_{si} \cap V_{dj}}{V_{si}}$$where $V_{si}$ and $V_{dj}$ are the volumes of the $i^{th}$ cell of the source grid and the $j^{th}$ cell of the destination grid, repectively.
Note that the expression above is only valid for extensive quantities, i.e., quantities whose value changes with the grid cell size (such as number of molecules). For intensive quantities (i.e., grid-size independent quantities such as mixing-ratio) we have:
$$\Phi_{si} = F_{si} V_{si}$$$$F_{dj} = \sum_i f_{ij} F_{si}$$$$p_{ij} = f_{ij} \frac{V_{dj}}{V_{si}} \Leftrightarrow f_{ij} = \frac{V_{si} \cap V_{dj}}{V_{dj}}$$where $F$ is the field to be regridded (intensive quantity, counterpart to $\Phi$).
Since the GEOS-Chem model's grid is defined by a geographic spherical coordinate system, and since we assume that the observational grid has the same system, the volumes of the grid cells can be calculated exactly from their corner coodinates. The true volume $V^T_i$ of a grid cell $i$ is given by:
$$V^T_i = \int_{\theta_{i,0}}^{\theta_{i,1}}\int_{\phi_{i,0}}^{\phi_{i,1}}\int_{\rho_{i,0}}^{\rho_{i,1}} \rho^{2} \sin{\left (\phi \right )}\, d\rho\, d\phi\, d\theta = \frac{1}{3} \left(\theta_{i,0} - \theta_{i,1}\right) \left(\rho_{i,0}^{3} \cos{\left (\phi_{i,0} \right )} - \rho_{i,0}^{3} \cos{\left (\phi_{i,1} \right )} - \rho_{i,1}^{3} \cos{\left (\phi_{i,0} \right )} + \rho_{i,1}^{3} \cos{\left (\phi_{i,1} \right )}\right)$$where ($\phi_{i,0}$, $\phi_{i,1}$), ($\theta_{i,0}$, $\theta_{i,1}$) and ($\rho_{i,0}$, $\rho_{i,1}$) are latitude, longitude and elevation (+ earth radius) bounds of the grid cell $i$, respectively.
Notes about the implementation
In [6]:
def _compute_exchange_vertical(src_z_coord, dst_z_coord):
"""
Compute the exchange vertical grid needed for
vertical regridding.
The cells of the exchange grid are defined by the
intersections of the cells of the source grid and the
the cells of the destination grid.
Returns a :class:`iris.coords.DimCoord` object.
"""
temp = np.concatenate((src_z_coord.bounds,
dst_z_coord.bounds))
temp = np.sort(temp, kind='mergesort', axis=None)
temp = np.unique(temp)
exchange_z_bounds = np.column_stack((temp[:-1], temp[1:]))
exchange_z_points = (exchange_z_bounds[:, 0] + exchange_z_bounds[:, 1]) / 2.
exchange_z_coord = iris.coords.DimCoord(exchange_z_points,
bounds=exchange_z_bounds,
standard_name=src_z_coord.standard_name,
units=src_z_coord.units)
return exchange_z_coord
def _map_exchange_vertical(src_or_dst_data,
src_or_dst_z_coord,
exchange_z_coord):
"""
Map on the exchange vertical grid data values given
on the source (or the destination) vertical grid.
Returns only the data (numpy array).
If a cell of the exchange grid doesn't overlap any
of the source (destination grid), value of that cell
is set to `np.nan`.
"""
points = exchange_z_coord.points
bounds = src_or_dst_z_coord.bounds
# 2-d boolean array (rows are exchange cells
# and cols are source/destination grid cells)
# True if exchange cell match with src or dst cell
is_in_cell = ((points[:,np.newaxis] > bounds[:,0]) &
(points[:,np.newaxis] < bounds[:,1]))
# a cell of the exchange should have only one
# corresponding cell in the source/destination grid
if is_in_cell.sum(axis=1).max() > 1:
raise ValueError("source/destination grid is not "
"compatible with the exchange grid")
# indices map between exchange grid (1st array)
# and source or destination grid (2st array)
map_indices = np.nonzero(is_in_cell)
# assign data
map_data = np.empty_like(points) * np.nan
map_data[map_indices[0]] = src_or_dst_data[map_indices[1]]
return map_data
def regrid_conservative_vertical(src_cube, dst_grid_cube,
src_data_type='intensive',
z_coord_name='altitude',
overlap_tol=0.95):
"""
Conservative regridding of a vertical profile.
Parameters
----------
src_cube : :class:`iris.cube.Cube`
the original profile to be regridded
dst_grid_cube : :class:`iris.cube.Cube`
defines the destination grid
src_data_type : ('intensive', 'extensive')
specifies whether the data of `src_cube`
is an intensive field (i.e., quantities
whose value changes with the grid cell size)
or an extensive field (i.e., grid-size
independent quantities).
z_coord_name : string
name of the vertical coordinate to use
for regridding (must be present in both
`src_cube` and `dst_grid_cube`).
overlap_top : float
cell overlap tolerance, i.e., a threshold
of - relative [0, 1] - overlapping height
between a cell of the destination grid
and cells of the source grid, under which
the value of the destination cell is set
to undefined.
Returns
-------
A new :class:`iris.cube.Cube` object
a copy of `dst_grid_cube` with
the regridded data of `src_cube`.
Auxilliary / scalar coordinates and
attributes of `src_cube` will be copied.
Notes
-----
Both `src_cube` and `dst_grid_cube` must be
1-dimensional and must have compatible
vertical coordinates, i.e., 1-dimensional,
with the same name `z_coord_name`, the same
coordinate system, the same units
and contiguous bounds.
"""
# get or compute vertical coordinates of the source, destination
# and exchange grids
# TODO: check if vertical coordinates of src_cube and dst_cube
# exist, are 1-dimensional and are compatibles (units,
# coord system, contiguous bounds)
src_z_coord = src_cube.coord(name=z_coord_name)
dst_z_coord = dst_grid_cube.coord(name=z_coord_name)
exchange_z_coord = _compute_exchange_vertical(src_z_coord,
dst_z_coord)
# compute cell heights of each grid
src_heights = src_z_coord.bounds[:,1] - src_z_coord.bounds[:,0]
dst_heights = dst_z_coord.bounds[:,1] - dst_z_coord.bounds[:,0]
exchange_heights = exchange_z_coord.bounds[:,1] - exchange_z_coord.bounds[:,0]
# generate arbitray levels for the destination grid
# (used for further aggregation - summing - of the exchange grid
# cells onto the destination grid
dst_levels = np.arange(1, dst_z_coord.points.size + 1)
# map source data values, source heights and destination heights
# (and destination levels) on the exchange grid
src_heights_mapped = _map_exchange_vertical(src_heights,
src_z_coord,
exchange_z_coord)
dst_heights_mapped = _map_exchange_vertical(dst_heights,
dst_z_coord,
exchange_z_coord)
dst_levels_mapped = _map_exchange_vertical(dst_levels,
dst_z_coord,
exchange_z_coord)
src_data_mapped = _map_exchange_vertical(src_cube.data,
src_z_coord,
exchange_z_coord)
# compute interpolation weights and data values
# on the exchange grid
if src_data_type == 'intensive':
exchange_iweights = exchange_heights / dst_heights_mapped
elif src_data_type == 'extensive':
exchange_iweights = exchange_heights / src_heights_mapped
else:
raise ValueError("invalid source data type: {}"
.format(src_data_type))
exchange_data = exchange_iweights * src_data_mapped
# finally get data values on the destination grid
# (sum data of overlapping cells of the exchange grid
# and apply the overlap tolerance)
levels_mask = dst_levels[:,np.newaxis] == dst_levels_mapped[np.newaxis,:]
dst_data = np.nansum((exchange_data[np.newaxis,:] * levels_mask),
axis=1)
valid_heights = np.where(np.isnan(exchange_data), np.nan, exchange_heights)
overlap_heights = np.nansum((valid_heights[np.newaxis,:] * levels_mask),
axis=1)
overlap_heights_relative = overlap_heights / dst_heights
nan_indices = np.nonzero(overlap_heights_relative < overlap_tol)
dst_data[nan_indices] = np.nan
# construct the cube of the regridded field
# copy all scalar coordinates from the source cube
dst_cube = dst_grid_cube.copy()
dst_cube.data = np.array(dst_data)
dst_cube.var_name = "_".join([src_cube.name(), "regridded"])
dst_cube.long_name = "_".join([src_cube.name(), "regridded"])
dst_cube.attributes.update(src_cube.attributes)
dst_cube.attributes.update({'src_cube' : src_cube.name(),
'grid_cube' : dst_cube.name(),
'regrid_method' : 'first-order conservative'})
dst_cube.units = src_cube.units
for c in src_cube.coords():
if c.points.size == 1 and not dst_cube.coords(coord=c):
dst_cube.add_aux_coord(c)
return dst_cube
In [7]:
bnds = np.column_stack((np.arange(0., 1002., 2)[:-1],
np.arange(0., 1002., 2)[1:]))
points = np.arange(1., 1000., 2)
src_alt_coord = iris.coords.DimCoord(points,
bounds=bnds,
standard_name='altitude',
units='m')
src_cube = iris.cube.Cube(np.ones(points.size) * points**2,
dim_coords_and_dims=[(src_alt_coord, 0)])
bnds = np.column_stack((np.arange(0., 1002., 8.)[:-1],
np.arange(0., 1002., 8.)[1:]))
points = np.arange(2., 1000., 8.)
dst_alt_coord = iris.coords.DimCoord(points,
bounds=bnds,
standard_name='altitude',
units='m')
dst_grid_cube = iris.cube.Cube(np.empty(points.size),
dim_coords_and_dims=[(dst_alt_coord, 0)])
result = regrid_conservative_vertical(src_cube, dst_grid_cube)
iqplot.plot(src_cube, src_cube.coord('altitude'))
iqplot.plot(result, result.coord('altitude'), '-+')
# check total mass conservation (should return around 1.0 with a very small error)
print (result.data[:] * 8.).sum() / (src_cube.data[:] * 2.).sum()
In [8]:
%timeit regrid_conservative_vertical(src_cube, dst_grid_cube)
In [9]:
def gcgrid_2_coords(model_name, model_resolution,
region_box=None):
"""
Get the X,Y,Z coordinates of the GEOS-Chem grid given
by `model_name` and `model_resolution`.
Parameters
----------
model_name : string
name of a GEOS-Chem grid model supported by pygchem
model_resolution : (float, float)
horizontal grid resolution (lon, lat), in degrees
region_box: (int, int, int, int, int, int) or None
grid indices of the 3D region box of interest
(imin, imax, jmin, jmax, lmin, lmax).
i: longitude, j: latitude, l: vertical levels
Returns
-------
i_coord, j_coord, l_coord : :class:`iris.coords.DimCoord`
Iris dimensional coordinates objects
(longitude, latitude, model_level_number)
"""
g = gcgrid.CTMGrid.from_model(model_name,
resolution=model_resolution)
lon_points, lat_points = g.lonlat_centers
elon, elat = g.lonlat_edges
lon_bounds = np.column_stack((elon[0:-1], elon[1:]))
lat_bounds = np.column_stack((elat[0:-1], elat[1:]))
levels = np.arange(1, g.Nlayers + 1)
if region_box is not None:
imin, imax, jmin, jmax, lmin, lmax = region_box
lon_points = lon_points[imin-1:imax]
lon_bounds = lon_bounds[imin-1:imax]
lat_points = lat_points[jmin-1:jmax]
lat_bounds = lat_bounds[jmin-1:jmax]
levels = np.arange(lmin, lmax + 1)
spherical_geocs = iris.coord_systems.GeogCS(
iris.analysis.cartography.DEFAULT_SPHERICAL_EARTH_RADIUS
)
i_coord = iris.coords.DimCoord(lon_points,
bounds=lon_bounds,
standard_name="longitude",
var_name="longitude",
units="degrees_east",
coord_system=spherical_geocs)
j_coord = iris.coords.DimCoord(lat_points,
bounds=lat_bounds,
standard_name="latitude",
var_name="latitude",
units="degrees_north",
coord_system=spherical_geocs)
l_coord = iris.coords.DimCoord(levels,
standard_name="model_level_number",
var_name="model_level_number",
units="1")
return i_coord, j_coord, l_coord
def fix_nd49_pressure(pedges_cube):
"""
GEOS-Chem Bug ?? PSURF_PEDGE-$ is missing one vertical level (e.g., using
the GEOS5_47L model, it has 47 levels but should have 48 levels).
Workaround: Add one level on top of the grid and assign low, arbitrary
pressure values at that level.
`pedges` must have 4-dimensions
(time, longitude, latitude, model_level_number)
Returns a new cube with one additional top level.
Procedure:
- extract the top level of the pressure cube (e.g., level 47 for GEOS5_47L)
- duplicate the cube
- replace the vertical coordinate value for the duplicated cube
(e.g., assign 48 for GEOS5_47L)
- assign new, arbitrary low pressures to the duplicated cube
(e.g., press_at_level_48 = 0.1 * pressure_at_level_47)
- re-assemble the cubes - split/slices and re-merge - to obtain a
new cube with 48 levels
"""
pedges_cube_slices = list(pedges_cube.slices(['time',
'longitude',
'latitude']))
pedges_cube_top = pedges_cube_slices[-1]
pedges_cube_overtop = pedges_cube_top.copy()
l_coord = iris.coords.DimCoord(np.array([48]),
standard_name='model_level_number')
pedges_cube_overtop.replace_coord(l_coord)
low_pressures = pedges_cube_top.data * 0.1
pedges_cube_overtop.data = low_pressures
pedges_cube_slices.append(pedges_cube_overtop)
# works only with both concatenate and merge
# (iris issue ? normally only merge is needed!)
pedges_cube_new = iris.cube.CubeList(pedges_cube_slices).concatenate().merge()[0]
pedges_cube_new.transpose(new_order=[1, 2, 3, 0])
return pedges_cube_new
def assign_coord_nd49_or_subset_ctm(cube, field, filename):
"""
A callback for GEOS-Chem datafields loading with Iris.
If `cube` is loaded from a ND49 diagnostic file (i.e., some
undefined dimension coordinates), generate the coordinates values
from the grid indices of the 3D region box.
(Else) If `cube` is loaded from a CTM file, extract a subset
that correspond to the region box.
"""
global i_coord, j_coord, l_coord
cube_is_ctm = True
if not cube.coords('longitude'):
cube.add_dim_coord(i_coord, 0)
cube_is_ctm = False
if not cube.coords('latitude'):
cube.add_dim_coord(j_coord, 1)
cube_is_ctm = False
if not cube.coords('model_level_number'):
cube.add_dim_coord(l_coord, 2)
if cube_is_ctm:
lonlat_subset = iris.Constraint(longitude=i_coord.points,
latitude=j_coord.points)
cube = cube.extract(lonlat_subset)
def ppbC_2_ppbv(cube):
"""
Convert to ppbv units for hydrocarbon tracers that
have ppbC units.
ppbC = parts per billion carbon
= ppbv * number of carbon atoms in the tracer molecule
"""
is_ppbC = cube.attributes.get('no_udunits2') == 'ppbC'
is_hydrocarbon = cube.attributes.get('hydrocarbon')
if is_hydrocarbon and is_ppbC:
carbon_weight = cube.attributes.get('carbon_weight')
cube.data = cube.data / carbon_weight
cube.units = 'ppbv'
In [10]:
station_region_indices = (imin, imax, jmin, jmax, lmin, lmax)
i_coord, j_coord, l_coord = gcgrid_2_coords(grid_model_name,
grid_model_resolution,
region_box=station_region_indices)
tracers2load = iris.AttributeConstraint(category=lambda category: category in categories,
name=lambda name: name in tracers)
all_cubes = iris.load(in_abspaths,
[tracers2load] + other_fields,
callback=assign_coord_nd49_or_subset_ctm)
tracer_cubes = all_cubes.extract(tracers2load)
other_cubes = all_cubes.extract(other_fields, strict=False)
# datafields required for columns and profiles calculation
pedges_cube = other_cubes.extract_strict('PSURF_PEDGE-$')
box_height_cube = other_cubes.extract_strict('BXHEIGHT_BXHGHT-$')
try:
n_air_cube = other_cubes.extract_strict('N(AIR)_BXHGHT-$')
except iris.exceptions.ConstraintMismatchError:
# ND49: air density datafields have a different name
n_air_cube = other_cubes.extract_strict('AIRDEN_TIME-SER')
# ND49: fix missing pressure level
pedges_cube = fix_nd49_pressure(pedges_cube)
# convert units for hydrocarbon tracers
for cube in tracer_cubes:
ppbC_2_ppbv(cube)
In [11]:
print tracer_cubes
print ''.join(np.repeat('-', 30))
print iris.cube.CubeList([pedges_cube, box_height_cube, n_air_cube])
In [12]:
extract_method = iris.analysis.interpolate.extract_nearest_neighbour
station_coords = [('latitude', station_lat), ('longitude', station_lon)]
tracer_profiles = iris.cube.CubeList(extract_method(cube, station_coords)
for cube in tracer_cubes)
pedges_profile, box_height_profile, n_air_profile = [
extract_method(cube, station_coords)
for cube in [pedges_cube, box_height_cube, n_air_cube]
]
all_profiles = tracer_profiles + [pedges_profile, box_height_profile, n_air_profile]
In [13]:
print all_profiles
print ''.join(np.repeat('-', 30))
print all_profiles[0]
In [14]:
def set_dim_order(cube, coord_names):
"""
Set the dimensions of `cube` in the order
defined by `coord_names`, i.e., a list that must
contain all of the cube's coordinate names.
Change the order of the cube dimensions in-place.
Warning: calling this method will trigger any
deferred loading, causing the cube’s data array
to be loaded into memory.
"""
ordered_dims = [cube.coord_dims(cube.coord(cn))[0]
for cn in coord_names]
cube.transpose(new_order=ordered_dims)
def get_dim_from_coord1d(cube, coord_name):
"""
Get from `cube` the dimension of the
1-dimensional coordinate specified by
`coord_name` (string).
Returns the dimension number of the coordinate.
Returns None if the coordinate is not dimensional.
Raises :err:`iris.exceptions.CoordinateMultiDimError`
if the coordinate is not 1-dimensional.
"""
c = cube.coord(coord_name)
cdims = cube.coord_dims(c)
if len(cdims) > 1:
raise iris.exceptions.CoordinateMultiDimError(
"'{}' coordinate must be 1-dimensional"
.format(coord_name)
)
elif not len(cdims):
return None
return cdims[0]
def get_altitude_coord(gridbox_heights, topography):
"""
Compute the altitude (elevation a.s.l) coordinate
from grid box heights and topography.
Parameters
----------
gridbox_heights : :class:`iris.cube.Cube`
a 1-d, 2-d, 3-d or 4-d cube.
The cube must have at least 'longitude',
'latitude' and 'model_level_number' coordinates.
'longitude' and 'latitude' can be either
1-dimensional or scalar.
'model_level_number' must be 1-dimensional and
should include all vertical levels.
topography : :class:`iris.cube.Cube`
a 2D cube, which must have 'longitude' and
'latitude' dimension coordinates. `topography`
must at least cover the horizontal extent of
`gridbox_heights` and have a compatible horizontal
grid (i.e., coordinates points and bounds must match).
Returns
-------
a :class:`iris.AuxCoord` object
An auxilliary coordinate that contains
space and time-dependent altitude values for
the points and bounds of each grid cell.
The coordinate has the same dimensions and
the same units than the `gridbox_heights` cube.
The coordinate also contains attributes of
the `topography` cube.
Examples
--------
>>> ctm_cubes = iris.load('ctm.bpch')
>>> box_heights = ctm.extract_strict('BXHEIGHT_BXHGHT-$')
>>> nox_tracer = ctm.extract_strict('NOx_IJ-AVG-$')
>>> global_topography = iris.load_cube('dem_GEOS57_2x2.5_awm.nc')
>>> altitude_coord = get_altitude_coord(box_heights,
... global_topography)
>>> nox_tracer.add_aux_coord(altitude_coord,
data_dims=range(0, nox_tracer.ndim))
"""
# extract the region of topography that cooresponds to
# the extent of gridbox_heights
gbh_lat = gridbox_heights.coord('latitude').points
gbh_lon = gridbox_heights.coord('longitude').points
subset = iris.Constraint(latitude=gbh_lat,
longitude=gbh_lon)
topography_subset = topography.extract(subset)
# ensure same units for gridbox_heights and topography
topography_subset.convert_units(gridbox_heights.units)
# ensure proper array broadcasting when computing altitude values
topo_lon_coord = topography_subset.coord('longitude')
topo_lat_coord = topography_subset.coord('latitude')
topo_dim_coords = topography_subset.dim_coords
if topography_subset.ndim == 0:
base_level_altitude = topography_subset.data
else:
gbh_lon_dim = get_dim_from_coord1d(gridbox_heights,
'longitude')
gbh_lat_dim = get_dim_from_coord1d(gridbox_heights,
'latitude')
if (topo_lon_coord in topo_dim_coords
and topo_lat_coord in topo_dim_coords):
set_dim_order(topography_subset, ['longitude', 'latitude'])
dim_map = (gbh_lon_dim, gbh_lat_dim)
elif topo_lon_coord in topo_dim_coords:
dim_map = (gbh_lon_dim,)
else:
dim_map = (gbh_lat_dim,)
base_level_altitude = iris.util.broadcast_to_shape(
topography_subset.data.copy(),
gridbox_heights.shape,
dim_map)
level_dim = get_dim_from_coord1d(gridbox_heights,
'model_level_number')
# calculate altitude points and bounds values
altitude_ubnd = np.cumsum(gridbox_heights.data, axis=level_dim) + base_level_altitude
altitude_lbnd = altitude_ubnd - gridbox_heights.data
altitude_lbnd[..., 1:] = altitude_ubnd[..., :-1] # force contiguous bounds
altitude_points = (altitude_lbnd + altitude_ubnd) / 2.
altitude_bounds = np.concatenate((altitude_lbnd[...,np.newaxis],
altitude_ubnd[...,np.newaxis]),
axis=-1)
# create and return the iris coordinate object
altitude_coord = iris.coords.AuxCoord(altitude_points,
bounds=altitude_bounds,
standard_name='altitude',
units=gridbox_heights.units,
attributes=topography.attributes.copy())
return altitude_coord
In [15]:
global_topography = iris.load_cube(global_topography_datafile)
altitude_coord = get_altitude_coord(box_height_profile,
global_topography)
for cube in tracer_profiles + [n_air_profile]:
if cube.coords('air_pressure'):
cube.remove_coord('air_pressure')
if cube.coords('altitude'):
cube.remove_coord('altitude')
cube.add_aux_coord(altitude_coord,
data_dims=range(0, box_height_profile.ndim))
In [16]:
station_profile = extract_method(iris.load_cube(station_vertical_grid_file),
station_coords)
def regrid_profile(profile_cube, station_profile):
"""
Vertical regridding of one profile
(slicing over time, regrid the slices, and re-merge).
"""
rprof_timeslices = []
for prof_timeslice in profile_cube.slices('model_level_number'):
rprof = regrid_conservative_vertical(prof_timeslice, station_profile)
rprof_timeslices.append(rprof)
return iris.cube.CubeList(rprof_timeslices).merge()[0]
regridded_tracer_profiles = iris.cube.CubeList([regrid_profile(p, station_profile)
for p in tracer_profiles])
regridded_n_air_profile = regrid_profile(n_air_profile, station_profile)
regridded_all_profiles = regridded_tracer_profiles + [regridded_n_air_profile]
In [17]:
print regridded_all_profiles
print ''.join(np.repeat('-', 30))
print regridded_n_air_profile
In [18]:
def compute_tracer_columns(mixing_ratio, n_air, z_coord,
cell_heights=None, weights=None,
units='count/cm2'):
"""
Compute tracer total columns.
Parameters
----------
mixing_ratio : :class:`iris.cube.Cube` object
Tracer mixing ratio.
n_air : :class:`iris.cube.Cube` object
Number density of air.
z_coord : string or :class:`iris.coords.Coord` object
(Name of) the vertical levels coordinate.
cell_heights : :class:`iris.cube.Cube` object or None
Grid cell heights. Must be provided if `z_coord`
can't be used to compute the height of each cell.
weights : :class:`iris.cube.Cube` object
vertical cell weights.
units : string or :class:iris.unit.`Unit` object
Units in which the results are returned.
Returns
-------
A :class:`iris.cube.Cube` object with the computed
total columns values.
Notes
-----
- This function computes total columns.
To compute partial columns, define `weights`
or reduce the cubes first.
- Partial columns are first calculated for each grid cell
as follows:
pcol = `mixing_ratio` * `N_air` * `cell_heights`
total columns are the sum of pcol over `z_coord`.
- Aware of units (e.g., ppbv, ppmv for `mixing_ratio`...).
specified in the input cubes metadata.
- For output units, 'count' is used instead of 'molec' (the latter
unit is not supported by udunits2), but the values are unchanged.
"""
if not isinstance(z_coord, iris.coords.Coord):
z_coord = mixing_ratio.coord(name=z_coord)
if weights is None:
weights = 1.
if cell_heights is None:
if not z_coord.has_bounds():
raise iris.exceptions.CoordinateNotRegularError(
'z_coord {} must have bounds'.format(z_coord.name)
)
if z_coord.units.symbol not in ['m', 'km']:
raise iris.exceptions.CoordinateNotRegularError(
"'m' or 'km' units required for z_coord, "
"found {} ({})".format(z_coord.units.symbol, z_coord.name)
)
map_height_dim = mixing_ratio.coord_dims(z_coord)
heights_data = z_coord.bounds[:,1] - z_coord.bounds[:,0]
heights_data = iris.util.broadcast_to_shape(heights_data,
mixing_ratio.shape,
map_height_dim)
cell_heights = mixing_ratio.copy()
cell_heights.data = heights_data
cell_heights.units = z_coord.units
# avoid missing values (convert to zero)
mixing_ratio = mixing_ratio.copy()
n_air = n_air.copy()
mixing_ratio.data = np.nan_to_num(mixing_ratio.data)
n_air.data = np.nan_to_num(n_air.data)
gridbox_partial_columns = mixing_ratio * n_air
gridbox_partial_columns *= cell_heights * weights
total_columns = gridbox_partial_columns.collapsed(z_coord,
iris.analysis.SUM)
total_columns.convert_units(units)
total_columns.long_name = "_".join([mixing_ratio.name(), 'columns'])
total_columns.attributes.update(mixing_ratio.attributes)
return total_columns
In [19]:
columns = [compute_tracer_columns(p,
regridded_n_air_profile,
'altitude')
for p in regridded_tracer_profiles]
tracer_columns = iris.cube.CubeList(columns)
print tracer_columns
In [20]:
def permute_dim_aux_coords(cube, dim_coord, aux_coord):
"""
Permute a dimension coordinate with
an auxilliary coordinate in a cube.
Parameters
----------
cube : :class:`iris.cube.Cube`
the cube to process (changes will be made in-place).
dim_coord : string or :class:`iris.coords.DimCoord`
(name of) the dimension coordinate to permute.
aux_coord : string or :class:`iris.coords.AuxCoord`
(name of) the auxilliary coordinate to permute.
must contain the same cube dimension than `dim_coord`
and must satisfy the criteria required for dimensional
coordinates (1D, numeric, and strictly monotonic).
"""
if not isinstance(dim_coord, iris.coords.DimCoord):
dim_coord = cube.coord(name=dim_coord)
if not isinstance(dim_coord, iris.coords.AuxCoord):
aux_coord = cube.coord(name=aux_coord)
if cube.coord_dims(aux_coord) != cube.coord_dims(dim_coord):
raise iris.exceptions.CoordinateNotRegularError(
"Cannot permute '{}' and '{}' coordinates"
.format(dim_coord.name(), aux_coord.name())
)
else:
dim = cube.coord_dims(dim_coord)[0]
new_dim_coord = iris.coords.DimCoord.from_coord(aux_coord)
new_aux_coord = iris.coords.AuxCoord.from_coord(dim_coord)
cube.remove_coord(dim_coord)
cube.remove_coord(aux_coord)
cube.add_dim_coord(new_dim_coord, dim)
cube.add_aux_coord(new_aux_coord, dim)
def remove_dim_aux_coords(cube, dimensions):
"""
Remove in `cube` all auxilliary coordinates
that contain `dimensions`.
"""
for coord in cube.coords(dimensions=dimensions):
if isinstance(coord, iris.coords.AuxCoord):
cube.remove_coord(coord)
In [21]:
out_profiles_basename = string.replace(out_profiles_basename, "*", station_name)
out_columns_basename = string.replace(out_columns_basename, "*", station_name)
out_profiles_basepath = os.path.join(out_dir, out_profiles_basename)
out_columns_basepath = os.path.join(out_dir, out_columns_basename)
In [22]:
def fix_cube_attributes_vals(cube):
"""
Iris issue ? when saving a cube to a netCDF file,
a TypeError is raised (by netCDF4) if there are
attributes with bool values.
Convert it to int.
"""
for key, val in cube.attributes.items():
if type(val) == bool:
cube.attributes[key] = int(val)
# make altitude as dimension coord
for profile_cube in regridded_tracer_profiles + [regridded_n_air_profile]:
if isinstance(profile_cube.coord('altitude'), iris.coords.AuxCoord):
permute_dim_aux_coords(profile_cube, 'model_level_number', 'altitude')
for cube in regridded_tracer_profiles + [regridded_n_air_profile]:
fix_cube_attributes_vals(cube)
for cube in tracer_columns:
fix_cube_attributes_vals(cube)
iris.save(regridded_tracer_profiles + [regridded_n_air_profile],
'.'.join([out_profiles_basepath, 'nc']))
iris.save(tracer_columns,
'.'.join([out_columns_basepath, 'nc']))
Series
, DataFrame
and/or Panel
objects from the profiles and columns cubes.
In [23]:
dataframe_profiles = {}
for profile in regridded_tracer_profiles + [regridded_n_air_profile]:
profile_cube = profile.copy()
# there must be only one defined dimension coordinate for each
# cube dimension (no auxilliary coordinate (convert iris to pandas)
z_dim = profile_cube.coord_dims(profile_cube.coord(name='altitude'))
remove_dim_aux_coords(profile_cube, z_dim)
dataframe_units = profile_cube.units.format()
if dataframe_units == 'unknown':
dataframe_units = profile_cube.attributes['no_udunits2']
dataframe_name = "{tracer} ({units})".format(
tracer=profile_cube.attributes['name'],
units=dataframe_units
)
# scalar time coordinate
if profile_cube.ndim == 1:
series = ipandas.as_series(profile)
time_coord = profile_cube.coord('time')
date = time_coord.units.num2date(time_coord.points[0])
dataframe_profiles[dataframe_name] = pd.DataFrame({date : series}).transpose()
# dimensional time coordinate
else:
dataframe_profiles[dataframe_name] = ipandas.as_data_frame(profile_cube)
panel_profiles = pd.Panel(dataframe_profiles).transpose(0, 2, 1).astype(np.float64)
series_columns = {}
for column in tracer_columns:
series_name = "{tracer} ({units})".format(
tracer=column.attributes['name'],
units='molec cm-2'
)
series_columns[series_name] = ipandas.as_series(column)
time_coord = column.coord('time')
date = time_coord.units.num2date(time_coord.points)
series_columns[series_name].index = date
dataframe_columns = pd.DataFrame(series_columns).astype(np.float64)
In [24]:
panel_profiles
Out[24]:
In [25]:
dataframe_columns
Out[25]:
In [26]:
if out_format in ('hdf', 'hdf5'):
panel_profiles.to_hdf('.'.join([out_profiles_basepath, 'hdf5']),
'profiles')
dataframe_columns.to_hdf('.'.join([out_columns_basepath, 'hdf5']),
'columns')
elif out_format in ('xls', 'xlsx'):
panel_profiles.to_excel('.'.join([out_profiles_basepath, out_format]),
'profiles')
dataframe_columns.to_excel('.'.join([out_columns_basepath, out_format]),
'columns')
elif out_format == 'csv':
panel_profiles.transpose(2, 0, 1).to_frame().to_csv(
'.'.join([out_profiles_basepath, 'csv'])
)
dataframe_columns.to_csv('.'.join([out_columns_basepath, out_format]))
In [27]:
!ls $out_dir -all -h
In [28]:
#pylab.rcParams['figure.figsize'] = (12.0, 10.0) # matplotlib default fig size
In [29]:
draw_n_prof = 10 # draw only the first N profiles.
originals = tracer_profiles + [n_air_profile]
regridded = regridded_tracer_profiles + [regridded_n_air_profile]
for profile_original, profile_regridded in zip(originals, regridded):
ts_original = profile_original.slices('model_level_number')
ts_regridded = profile_regridded.slices('model_level_number')
plt.figure()
inc = 0
for p_original, p_regridded in zip(ts_original, ts_regridded):
if inc > draw_n_prof:
break
ax = plt.subplot(121)
iqplot.plot(p_original, p_original.coord('altitude'), '-+')
ax2 = plt.subplot(122)
iqplot.plot(p_regridded, p_regridded.coord('altitude'), '-+')
ax2.set_xlim(ax.get_xlim())
ymin = p_regridded.coord('altitude').points[0]
ymax = p_regridded.coord('altitude').points[-1]
ax.set_ylim([ymin, ymax])
ax2.set_ylim([ymin, ymax])
inc += 1
In [30]:
dataframe_columns.plot()
Out[30]:
In [30]: