Getting started with OGGM: Ötztal case study

The OGGM workflow is best explained with an example. We are going to use the case we use for testing the oggm codebase. The test files are located in a dedicated online repository, oggm-sample-data.

Input data

In the test-workflow directory you can have a look at the various files we will need. oggm also needs them for testing, so they are automatically available to everybody with a simple mechanism:


In [ ]:
import oggm
from oggm import cfg
from oggm.utils import get_demo_file
cfg.initialize()
srtm_f = get_demo_file('srtm_oetztal.tif')
rgi_f = get_demo_file('rgi_oetztal.shp')
print(srtm_f)

The very first time that you make a call to get_demo_file(), oggm will create a hidden .oggm directory in your home folder$^*$ and download the demo files in it.

*: this path might vary depending on your platform, see python's [expanduser](https://docs.python.org/3.5/library/os.path.html#os.path.expanduser)

DEM and glacier outlines

The data directory contains a subset of the RGI V4 for the Ötztal:


In [ ]:
import salem  # https://github.com/fmaussion/salem
rgi_shp = salem.read_shapefile(rgi_f).set_index('RGIId')

We'll have a look at it, but first we will need to make some imports and set some defaults:


In [ ]:
# Plot defaults
%matplotlib inline
import matplotlib.pyplot as plt
# Packages
import os
import numpy as np
import xarray as xr
import shapely.geometry as shpg
plt.rcParams['figure.figsize'] = (8, 8)  # Default plot size

Plot the glaciers of the Ötztal case study:


In [ ]:
rgi_shp.plot();

Calibration / validation data

These 19 glaciers were selected because they have either mass-balance data (WGMS) or total volume information (GlaThiDa). These data are required for calibration/validation and are available automatically in OGGM.

Climate data

For this test case we use HISTALP data (which goes back further in time than CRU), stored in the NetCDF format. The resolution of HISTALP (5 minutes of arc) is relatively high, but some kind of downscaling will be necessary to compute the mass-balance at the glacier scale.

We can plot a timeseries of the data, for example for the grid point (3, 3):


In [ ]:
fig = plt.figure(figsize=(9, 3))
ds = xr.open_dataset(get_demo_file('HISTALP_oetztal.nc'))
ds.temp[:, 3, 3].resample('AS', dim='time').plot()
plt.title('HISTALP annual temperature (°C)');

Setting up an OGGM run

OGGM parameters are gathered in a configuration file. The default file is shipped with the code. It is used to initialize the configuration module:


In [ ]:
from oggm import cfg
from oggm import workflow
cfg.initialize()  # read the default parameter file

For example, the cfg module has a global variable PATHS (a dictionary) storing the file paths to the data and working directories:


In [ ]:
cfg.PATHS

The path to the input data files are missing. Let's set them so that the oggm modules know where to look for them (the default would be to download them automatically, which we would like to avoid for this example):


In [ ]:
cfg.PATHS['dem_file'] = get_demo_file('srtm_oetztal.tif')
cfg.PATHS['climate_file'] = get_demo_file('HISTALP_oetztal.nc')

We will set the "border" option to a larger value, since we will do some dynamical simulations ("border" decides on the number of DEM grid points we'd like to add to each side of the glacier for the local map: the larger the glacier will grow, the larger border should be):


In [ ]:
cfg.PARAMS['border'] = 80

We keep the other parameters to their default values, for example the precipitation scaling factor:


In [ ]:
cfg.PARAMS['prcp_scaling_factor']

Glacier working directories

An OGGM "run" is made of several successive tasks to be applied on each glacier. Because these tasks can be computationally expensive they are split in smaller tasks, each of them storing their results in a glacier directory.

The very first task of an OGGM run is always init_glacier_regions:


In [ ]:
# Read in the RGI file
import geopandas as gpd
rgi_file = get_demo_file('rgi_oetztal.shp')
rgidf = gpd.GeoDataFrame.from_file(rgi_file)
# Initialise directories
# reset=True will ask for confirmation if the directories are already present: 
# this is very useful if you don't want to loose hours of computations because of a command gone wrong
gdirs = oggm.workflow.init_glacier_regions(rgidf, reset=True)

Note that if I run init_glacier_regions a second time without reset=True, nothing special happens. The directories will not be overwritten, just "re-opened":


In [ ]:
gdirs = workflow.init_glacier_regions(rgidf)

Now what is the variable gdirs? It is a list of 19 GlacierDirectory objects. They are here to help us to handle data input/output and to store several glacier properties. Here are some examples:


In [ ]:
gdir = gdirs[13]
gdir

gdir provides a get_filepath function which gives access to the data files present in the directory:


In [ ]:
gdir.get_filepath('dem')

dem.tif is a local digital elevation map with a spatial resolution chosen by OGGM as a function of the glacier size. These GlacierDirectory objects are going to be the input of almost every OGGM task.

This data model has been chosen so that even complex functions requires serval input data can be called with one single argument:


In [ ]:
from oggm import graphics
graphics.plot_googlemap(gdir)

OGGM tasks

The workflow of OGGM is oriented around the concept of "tasks". There are two different types:

Entity Task: Standalone operations to be realized on one single glacier entity, independently from the others. The majority of OGGM tasks are entity tasks. They are parallelisable.

Global Task: tasks which require to work on several glacier entities at the same time. Model parameter calibration or interpolation of degree day factors belong to this type of task. They are not parallelisable.

OGGM implements a simple mechanism to run a specific task on a list of GlacierDir objects (here, the function glacier_masks() from the module oggm.prepro.gis):


In [ ]:
from oggm import tasks

In [ ]:
# run the glacier_masks task on all gdirs
workflow.execute_entity_task(tasks.glacier_masks, gdirs)

We just computed gridded boolean masks out of the RGI outlines.

It is also possible to apply several tasks sequentially:


In [ ]:
list_talks = [
         tasks.compute_centerlines,
         tasks.compute_downstream_lines,
         tasks.catchment_area,
         tasks.initialize_flowlines,
         tasks.catchment_width_geom,
         tasks.catchment_width_correction,
         tasks.compute_downstream_bedshape
         ]
for task in list_talks:
    workflow.execute_entity_task(task, gdirs)

The function execute_task can run a task on different glaciers at the same time, if the use_multiprocessing option is set to True in the configuration file.

With all these tasks we just computed the glacier flowlines and their width:


In [ ]:
graphics.plot_catchment_width(gdir, corrected=True)

Global tasks, climate tasks

We will go into more detail about tasks in the documentation. For now, we will use the helper function:


In [ ]:
workflow.climate_tasks(gdirs)

We just read the climate data, "downscaled" it to each glacier, computed possible $\mu^*$ for the reference glaciers, picked the best one, interpolated the corresponding $t^*$ to glaciers without mass-balance observations, computed the mass-balance sensitivity $\mu$ for all glaciers and finally computed the mass-balance at equilibrium (the "apparent mb" in Farinotti et al., 2009).

Finally, we will prepare the data for the inversion, which is an easy:


In [ ]:
workflow.execute_entity_task(tasks.prepare_for_inversion, gdirs)

Inversion

This is where things become a bit more complicated. The inversion is already fully automated in OGGM, but for this tutorial we will try to explain in more detail what is happening.

Let's start with the funcion mass_conservation_inversion:


In [ ]:
from oggm.core.preprocessing.inversion import mass_conservation_inversion

This function will compute the ice thickness along the flowline. It has one free parameter (or too, if you also want to consider the basal sliding term in the inversion): Glen's deformation parameter A. Let's compute the bed inversion for Hintereisferner and the default A:


In [ ]:
# Select HEF out of all glaciers
gdir_hef = [gd for gd in gdirs if (gd.rgi_id == 'RGI50-11.00897')][0]

In [ ]:
glen_a = cfg.A
vol_m3, area_m3 = mass_conservation_inversion(gdir_hef, glen_a=glen_a)
print('With A={}, the mean thickness of HEF is {:.1f} m'.format(glen_a, vol_m3/area_m3))

In [ ]:
graphics.plot_inversion(gdir_hef)

We know from the literature (Fisher et al, 2013) that the HEF should have an average thickness of 67$\pm$7 m. How sensitive is the inversion to changes in the A parameter?


In [ ]:
factor = np.linspace(0.1, 10, 30)
thick = factor*0
for i, f in enumerate(factor):
    vol_m3, area_m3 = mass_conservation_inversion(gdir_hef, glen_a=glen_a*f)
    thick[i] = vol_m3/area_m3
plt.figure(figsize=(6, 4))
plt.plot(factor, thick);
plt.ylabel('Mean thickness (m)');
plt.xlabel('Multiplier');

The A parameter controls the deformation of the ice, and therefore the thickness. It is always possible to find a "perfect" A for each glacier with measurements, for example by using an optimisation function. The current way to deal with this in OGGM is to use all glaciers with a volume estimate from the GLaThiDa database, and define A so that the volume RMSD is minimized. The reason for choosing the volume (which is strongly affected by the area) over the thickness is that with this method, larger glaciers will have more influence on the final results.


In [ ]:
optim_resuls = tasks.optimize_inversion_params(gdirs)

The optimize_inversion_params task also writes some statistics in the working directory:


In [ ]:
import pandas as pd
fpath = os.path.join(cfg.PATHS['working_dir'], 'inversion_optim_results.csv')
df = pd.read_csv(fpath, index_col=0)
df['ref_thick'] = df['ref_volume_km3'] / df['ref_area_km2'] * 1e3
df['oggm_thick'] = df['oggm_volume_km3'] / df['ref_area_km2'] * 1e3
df['vas_thick'] = df['vas_volume_km3'] / df['ref_area_km2'] * 1e3

In [ ]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))
ax1.scatter(df['ref_thick'], df['oggm_thick'], s=100)
ax1.set_title('OGGM RMSD: {:.2f}'.format(oggm.utils.rmsd(df['ref_thick'], df['oggm_thick'])))
ax1.set_xlabel('Ref thickness')
ax1.set_ylabel('OGGM thickness')
ax1.plot([0, 100], [0, 100], '.:k', zorder=0);
ax1.set_xlim([0, 100]), ax1.set_ylim([0, 100]);
ax2.scatter(df['ref_thick'], df['vas_thick'], s=100)
ax2.set_title('Volume-Area RMSD: {:.2f}'.format(oggm.utils.rmsd(df['ref_thick'], df['vas_thick'])))
ax2.set_xlabel('Ref thickness')
ax2.set_ylabel('VAS thickness')
ax2.plot([0, 100], [0, 100], '.:k', zorder=0);
ax2.set_xlim([0, 100]), ax2.set_ylim([0, 100]);

Finalize the inversion


In [ ]:
# Use the optimal parameters for inveting all glaciers and apply a simple correction filter
workflow.execute_entity_task(tasks.volume_inversion, gdirs)
workflow.execute_entity_task(tasks.filter_inversion_output, gdirs)

Flowline model

All the previous steps are necessary to run the flowline model: the computation of the flowline(s) and their width, the interpolation of the climate data, the mass-balance sensitivity $\mu$, an estimate of the glacier bed...

All this data are stored in the glacier directories. For example for HEF the data should be approx 2.2 Mb. You can explore the various files available in the directory printed below:


In [ ]:
print(gdir_hef.dir)

The files are partly documented here.

The first task to apply before using the model is the init_present_time_glacier function:


In [ ]:
tasks.init_present_time_glacier(gdir_hef)

This task is required to merge the various glacier divides back together and to allow the glacier to grow by adding the downstream flowlines. This function also decides on the shape of the glacier bed along the flowlines and downstream (currently an "average" parabolic shape is chosen). Let's initialize our model with this geometry:


In [ ]:
from oggm.core.models.flowline import FluxBasedModel
# the flowlines alone
fls = gdir_hef.read_pickle('model_flowlines')
model = FluxBasedModel(fls)
graphics.plot_modeloutput_map(gdir_hef, model=model);

A cross-section along the glacier can be visualized with the following function:


In [ ]:
graphics.plot_modeloutput_section(gdir, model=model);

Mass balance

To run the model, one has to define a mass-balance function. They are implemented in the massbalance module:


In [ ]:
from oggm.core.models.massbalance import ConstantMassBalanceModel
from oggm.core.models.massbalance import PastMassBalanceModel

For example, let's have a look at the mass-balance profile of HEF for the period 1970-2000, for the period $t^*$, and for the year 2003:


In [ ]:
today_model = ConstantMassBalanceModel(gdir_hef, y0=1985)
tstar_model = ConstantMassBalanceModel(gdir_hef)
hist_model = PastMassBalanceModel(gdir_hef)

In [ ]:
# Altitude of the main flowline:
z = model.fls[-1].surface_h
# Get the mass balance and convert to m per year
mb_today = today_model.get_annual_mb(z) * cfg.SEC_IN_YEAR * cfg.RHO / 1000.
mb_tstar = tstar_model.get_annual_mb(z) * cfg.SEC_IN_YEAR * cfg.RHO / 1000.
mb_2003 = hist_model.get_annual_mb(z, 2003) * cfg.SEC_IN_YEAR * cfg.RHO / 1000.
# Plot
plt.figure(figsize=(8, 5))
plt.plot(mb_today, z, '*', label='1970-2000');
plt.plot(mb_tstar, z, '*', label='t*');
plt.plot(mb_2003, z, '*', label='2003');
plt.ylabel('Altitude (m)');
plt.xlabel('Annual MB (m we)');
plt.legend(loc='best');

Define a model run

For a complete run you need to specify an initial state, a mass-balance model and the ice-flow parameter(s):


In [ ]:
fls = gdir_hef.read_pickle('model_flowlines')
commit_model = FluxBasedModel(fls, mb_model=today_model, glen_a=cfg.A)

It is now possible to run the model for any period of time:


In [ ]:
# Run for 50 years
commit_model.run_until(50)
graphics.plot_modeloutput_section(gdir_hef, model=commit_model)

Or until an equilibrium is reached (in this case it is possible because the mass-balance is constant in time):


In [ ]:
commit_model.run_until_equilibrium()
graphics.plot_modeloutput_section(gdir_hef, model=commit_model)

In [ ]:
graphics.plot_modeloutput_map(gdir_hef, model=commit_model)

This is a very good example of how surprising glaciers can be. Let's redo this run and store the glacier evolution with time:


In [ ]:
# Reinitialize the model (important!)
fls = gdir_hef.read_pickle('model_flowlines')
commit_model = FluxBasedModel(fls, mb_model=today_model, glen_a=cfg.A)
# Run and store
years = np.arange(200) * 2
volume = np.array([])
for y in years:
    commit_model.run_until(y)
    volume = np.append(volume, commit_model.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume)
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');

How important is the A parameter for the equilibrium volume?


In [ ]:
# Reinitialize the model (important!)
fls = gdir_hef.read_pickle('model_flowlines')
commit_model_1 = FluxBasedModel(fls, mb_model=today_model, glen_a=cfg.A*1)
fls = gdir_hef.read_pickle('model_flowlines')
commit_model_2 = FluxBasedModel(fls, mb_model=today_model, glen_a=cfg.A*2)
fls = gdir_hef.read_pickle('model_flowlines')
commit_model_3 = FluxBasedModel(fls, mb_model=today_model, glen_a=cfg.A*3)
# Run and store
years = np.arange(200) * 2
volume_1 = np.array([])
volume_2 = np.array([])
volume_3 = np.array([])
for y in years:
    commit_model_1.run_until(y)
    volume_1 = np.append(volume_1, commit_model_1.volume_m3)
    commit_model_2.run_until(y)
    volume_2 = np.append(volume_2, commit_model_2.volume_m3)
    commit_model_3.run_until(y)
    volume_3 = np.append(volume_3, commit_model_3.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume_1, label='1.0 A')
plt.plot(years, volume_2, label='2.0 A')
plt.plot(years, volume_3, label='3.0 A')
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');
plt.legend(loc='best');

Random runs

Equilibrium runs of course are not realistic. The normal variability of climate can lead to retreats and advances without any external forcing. OGGM therfore implements a random mass-balance model, which simply shuffles the observed years during a selected period of time.


In [ ]:
from oggm.core.models.massbalance import RandomMassBalanceModel

In [ ]:
# Define the mass balance model
random_today = RandomMassBalanceModel(gdir, y0=1985, seed=0)

# Plot th specifi mass-balance
h, w = gdir.get_inversion_flowline_hw()
years = np.arange(1000)
mb_ts = random_today.get_specific_mb(h, w, year=years)
plt.figure(figsize=(10, 4))
plt.plot(years, mb_ts);

As you can see, the mass-balance has no visible trend. The time-series are not stricly gaussians, since only "observed" years can happen: the randomness occurs in the sequence of the events.

Let's make a run with this mass-balance:


In [ ]:
fls = gdir_hef.read_pickle('model_flowlines')
random_model = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*3)
# Run and store
years = np.arange(500) * 2
volume = np.array([])
for y in years:
    random_model.run_until(y)
    volume = np.append(volume, random_model.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume)
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');

Let's see what influence does the Glen's parameter A have on the glacier evolution. Note that if we use the same mass-balance model for all runs they will all have the same random climate sequence! This is very useful for various reasons.


In [ ]:
# Reinitialize the model (important!)
fls = gdir_hef.read_pickle('model_flowlines')
random_model_1 = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*1)
fls = gdir_hef.read_pickle('model_flowlines')
random_model_2 = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*2)
fls = gdir_hef.read_pickle('model_flowlines')
random_model_3 = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*3)
# Run and store
years = np.arange(250) * 2
volume_1 = np.array([])
volume_2 = np.array([])
volume_3 = np.array([])
for y in years:
    random_model_1.run_until(y)
    volume_1 = np.append(volume_1, random_model_1.volume_m3)
    random_model_2.run_until(y)
    volume_2 = np.append(volume_2, random_model_2.volume_m3)
    random_model_3.run_until(y)
    volume_3 = np.append(volume_3, random_model_3.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume_1, label='1.0 A')
plt.plot(years, volume_2, label='2.0 A')
plt.plot(years, volume_3, label='3.0 A')
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');
plt.legend(loc='best');

After the spin-up time, the three models have quite similar evolutions but quite different volumes!

Let's use different random series this time, keeping A constant:


In [ ]:
# Reinitialize the model (important!)
fls = gdir_hef.read_pickle('model_flowlines')
random_today = RandomMassBalanceModel(gdir, y0=1985, seed=1)
random_model_1 = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*3)
fls = gdir_hef.read_pickle('model_flowlines')
random_today = RandomMassBalanceModel(gdir, y0=1985, seed=2)
random_model_2 = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*3)
fls = gdir_hef.read_pickle('model_flowlines')
random_today = RandomMassBalanceModel(gdir, y0=1985, seed=3)
random_model_3 = FluxBasedModel(fls, mb_model=random_today, glen_a=cfg.A*3)
# Run and store
years = np.arange(250) * 2
volume_1 = np.array([])
volume_2 = np.array([])
volume_3 = np.array([])
for y in years:
    random_model_1.run_until(y)
    volume_1 = np.append(volume_1, random_model_1.volume_m3)
    random_model_2.run_until(y)
    volume_2 = np.append(volume_2, random_model_2.volume_m3)
    random_model_3.run_until(y)
    volume_3 = np.append(volume_3, random_model_3.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume_1, label='Seed 1')
plt.plot(years, volume_2, label='Seed 2')
plt.plot(years, volume_3, label='Seed 3')
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');
plt.legend(loc='best');

Historical runs

Now this is where it becomes interesting. Let's define a run with "real" mass-balance time-series. Let's assume that the 1850 glacier geometry is the same as today's, and run the model over 153 years:


In [ ]:
# Reinitialize the model (important!)
fls = gdir_hef.read_pickle('model_flowlines')
# Same as before with another mass-balance model and a real starting year y0:
model = FluxBasedModel(fls, mb_model=hist_model, glen_a=cfg.A*3, y0=1850)  
# Run and store
years = np.arange(153) + 1851
volume = np.array([])
for y in years:
    model.run_until(y)
    volume = np.append(volume, model.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume)
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');

Today's HEF would probably be too small to be in equilibrium with the climate during most of the simulation period. The time it needs to re-adjust depends on the glacier characteristics as well as A. We need a way to make our glacier grow first, so that it can shrink as we expect it to do.


In [ ]:
# Reinitialize the model (important!)
fls = gdir_hef.read_pickle('model_flowlines')
# Grow model: tstar climate
mb_model = ConstantMassBalanceModel(gdir_hef)
mb_model.temp_bias = -0.2
grow_model = FluxBasedModel(fls, y0=0, mb_model=mb_model, glen_a=cfg.A*3)
# run until equilibrium
grow_model.run_until_equilibrium()
# plot
graphics.plot_modeloutput_map(gdir_hef, model=grow_model)

Ok. Now reinitialize the historical run with this new input and see:


In [ ]:
# Reinitialize the model with the new geom
fls = grow_model.fls
model = FluxBasedModel(fls, mb_model=hist_model, glen_a=cfg.A*3, y0=1850)  
# Run and store
years = np.arange(153) + 1851
volume = np.array([])
for y in years:
    model.run_until(y)
    volume = np.append(volume, model.volume_m3)
# Plot
plt.figure(figsize=(8, 5))
plt.plot(years, volume)
plt.ylabel('Volume (m3)');
plt.xlabel('Time (years)');

Looks better! But still not perfect:


In [ ]:
graphics.plot_modeloutput_map(gdir_hef, model=model)

There's still a lot to do!


In [ ]: