In [ ]:
%matplotlib inline

Iris introduction course

6. Data Processing

Learning Outcome: by the end of this section, you will be able to apply arithmetic and statistical operations on a Cube.

Duration: 1 hour

Overview:
6.1 Cube Arithmetic
6.2 Aggregation and Statistics
6.3 Exercise
6.4 Summary of the Section

Setup


In [ ]:
import iris
import numpy as np

6.1 Cube Arithmetic

Cubes support the basic mathematical operators, so you can add, subtract, multiply and divide cubes of compatible shape, as well as perform other mathematical operations on them:


In [ ]:
a1b = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))
e1 = iris.load_cube(iris.sample_data_path('E1_north_america.nc'))

print(e1.summary(True))
print(a1b)

In [ ]:
scenario_difference = a1b - e1
print(scenario_difference)

Notice that the resultant cube's name is now unknown. Also, the resultant cube's attributes and cell_methods have disappeared; this is because these were different between the two input cubes.


It is also possible to operate on cubes with numeric scalars, NumPy arrays and even cube coordinates.

Exercise:

Can you multiply the 'e1' air temperature cube by its own latitude coordinate?
What are the units of the result?


In [ ]:
#
# edit space for user code
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1b

Although a cube's units can be freely set to any valid unit, the calculation of result units and compatibility checking is built into the arithmetic operations.

For example, let's create two new cubes, one with units of "feet" and one with units of "days", and divide the first by the second:


In [ ]:
six_feet = iris.cube.Cube(6.0, units='feet')
twelve_days = iris.cube.Cube(12.0, units='days')
print(six_feet / twelve_days)

Exercise:

What do you predict will result from adding together the 'six_feet' and 'twelve_days' cubes?


In [ ]:
#
# edit space for user code
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1c

Note that you can update a cube's data and metadata directly, for instance by assigning to cube.data, cube.standard_name or cube.units. When you do this, though, you need to take care that the metadata is still an accurate description of the data. By changing data or metadata explicitly, you are asserting that the result is correct.

Exercise:

First rename() the "six_feet" cube to 'depth' (which is a valid standard name).
What happens if you then rename it to 'potential_temperature'?
What is the meaning of the resulting data cube?
What happens if you then set the units of this to a mass (e.g. 'kg')?


In [ ]:
#
# edit space for user code
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1d

Another function of cube arithmetic is to support 'broadcasting', in the NumPy sense: operations between data with different shapes.

In fact we already saw this above, with product = e1 * e1.coord('latitude').

Broadcasting is simpler in Iris than in NumPy, because the dimensions are "lined up" by matching their coordinates, rather than depending on the ordering of dimensions.

Exercise:

In the above example, product = e1 * e1.coord('latitude'), the data array content comes from multiplying e1.data by e1.coord('latitude').points.
What happens if you simply multiply those two arrays? Are the values the same?


In [ ]:
#
# edit space for user code
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1e

An even simpler example of broadcasting is doing arithmetic between a cube and a scalar value.

Exercise:

What happens if you add the scalar value 5.2 to the e1 cube?
What is the meaning of the result?


In [ ]:
#
# edit space for user code
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1f

If the scalar is just a value, like this one, then it is assumed to have the same units as the cube.

However, a scalar cube or coordinate has its own units, which take part in the calculation, as seen above in the "feet per day" calculation.

6.2 Cube Aggregation and Statistics

Iris provides many standard univariate aggregations. An aggregation statistically collapses one or more dimensions of a cube, for example to a mean or a maximum, for the purposes of analysing the cube's data. Iris uses the term 'aggregators' to refer to the statistical operations that can be used for aggregation.

A list of aggregators is available at http://scitools.org.uk/iris/docs/latest/iris/iris/analysis.html.


In [ ]:
fname = iris.sample_data_path('uk_hires.pp')
cube = iris.load_cube(fname, 'air_potential_temperature')
print(cube.summary(True))

To take the vertical mean of this cube:


In [ ]:
print(cube.collapsed('model_level_number', iris.analysis.MEAN))

NOTE: the printout shows that the result has a cell method of "mean: model_level_number". Cell methods are a CF metadata convention which records that data are the results of statistical operations.


Exercise:

How can you calculate all-time minimum temperatures for this data (the minimum over the whole time period), and what is the form of the result?


In [ ]:
#
# edit space for user code
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.2a

In addition to "collapse", other types of statistical reduction are also possible. These also use aggregators to define the statistic. See, for example, the Iris documentation for Cube.aggregated_by and Cube.rolling_window.

6.3 Section Review Exercise

Let's apply all that we've learned about data processing and visualisation in Iris. We will perform data processing and visualisation to compare two possible future climate scenarios, called the A1B scenario and the E1 scenario.

1. Load data

Load as cubes the datasets found at iris.sample_data_path('E1_north_america.nc') and iris.sample_data_path('A1B_north_america.nc'). Print a contents summary of each cube.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3a

2. Calculate the difference between scenarios for a given year

Produce cubes covering only the year 2099, from both scenarios. Calculate the temperature difference between them.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3b

3. Plot E1, A1B and difference side by side

Plot the data in a single figure with three maps, side by side in one row:

  • the air temperature in the E1 scenario for the year 2099,
  • the air temperature in the A1B scenario for the year 2099, and
  • the difference between the two scenarios.

Think about the most appropriate plot functions, and the matplotlib colormap(s) to use for each plot.

Hint: the different matplotlib colormaps can be seen at https://matplotlib.org/1.5.3/examples/color/colormaps_reference.html.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3c

4. Produce time sequences of regional average air temperatures

Extract data for latitudes in a band of 25 to 30 degrees, and reduce it to a single average value at each time. This gives a 1-dimensional time sequence for each scenario. Calculate the model difference between these two.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3d

5. Draw comparison line plots

Make a single plot with the data from the two absolute temperature cubes you produced in part 4. Make sure you label the lines you plot. Also plot the difference "e1 - a1b" for comparison.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3e

6.4 Section Summary: Data processing

In this section we learnt:

  • cubes can be combined with arithmetic operators like addition, as for NumPy arrays. Broadcasting also works.
  • coordinates can also be used in cube arithmetic.
  • aggregators are provided to perform statistical aggregations of cube data.
  • statistics can be calculated over selected dimensions, identified by coordinates.