In [ ]:
%matplotlib inline

Iris introduction course

3. Subcube Extraction

Learning outcome: by the end of this section, you will be able to use various Iris facilities to extract sub-sections of a dataset.

Duration: 1 hour

Overview:
3.1 Indexing
3.2 Constraints and Extraction
3.3 Iterating Over a Cube
3.4 Exercise
3.5 Summary of the Section

Setup


In [ ]:
import iris

3.1 Indexing

Cubes can be indexed in a similar manner to that of NumPy arrays:


In [ ]:
fname = iris.sample_data_path('uk_hires.pp')
cube = iris.load_cube(fname, 'air_potential_temperature')
print(cube.summary(shorten=True))

In [ ]:
subcube = cube[..., ::2, 15:35, :10]
subcube.summary(shorten=True)

The index operation above selects:

  • The first 10 elements from the last dimension, which is grid_longitude
  • The elements at indices 15 to 34 (20 elements, since the stop index is exclusive) of the grid_latitude dimension
  • Every second element from the model_level_number dimension
  • All elements in the preceding dimensions.

Note: the result of indexing a cube is always a copy and never a view on the original data.
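
As a rough check on the index expression above, the same slicing can be applied to a plain NumPy array. This is only a sketch: the array shape (3, 7, 204, 187) is an assumption standing in for the cube's (time, model_level_number, grid_latitude, grid_longitude) dimensions, and no Iris objects are involved.

```python
import numpy as np

# Stand-in array; the shape is an assumption mirroring
# (time, model_level_number, grid_latitude, grid_longitude).
data = np.zeros((3, 7, 204, 187))

# The same index expression that was applied to the cube.
sub = data[..., ::2, 15:35, :10]
print(sub.shape)

# Basic NumPy slicing returns a view that shares memory with `data`.
shares_memory = np.shares_memory(data, sub)
print(shares_memory)
```

The shapes follow the same rules as for the cube, but the memory behaviour differs: basic slicing in NumPy yields a view, whereas the note above says that indexing a cube gives a copy.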

Exercise:

Try indexing the subcube from above to select the first 2 elements from the time dimension.


In [ ]:
#
# edit space for user code ...
#

SAMPLE SOLUTION:
Un-comment and execute the following, to view a possible code solution.
Then run it ...


In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_3.1a

3.2 Constraints and Extraction

We've already seen the basic load function, which will load in a CubeList of all the cubes that Iris finds in the given files.

To control which cubes are actually loaded we can use constraints. The simplest constraint is just a string, which filters cubes based on their name.

Below, we load the uk_hires.pp file from Iris's sample data with a constraint so that only cubes named air_potential_temperature are loaded:


In [ ]:
fname = iris.sample_data_path('uk_hires.pp')
print(iris.load(fname, 'air_potential_temperature'))

Iris's constraints mechanism provides a powerful way to filter a subset of data from a larger collection. We've already seen that constraints can be used at load time to return data of interest from a file, but we can also apply constraints to a single cube, or a CubeList, using their respective extract methods.

Below, we load in the same file as before, this time without supplying any constraint. Then we use the extract method with a constraint specifying the cube's name.


In [ ]:
cubes = iris.load(fname)
potential_temperature_cubes = cubes.extract('air_potential_temperature')
print(potential_temperature_cubes)

The above two examples demonstrate the simplest constraint, namely a string that matches a cube's name, which is conveniently converted into an iris.Constraint instance wherever needed.

However, we could construct this constraint manually and compare with the previous result:


In [ ]:
pot_temperature_constraint = iris.Constraint('air_potential_temperature')

pt2_cubes = cubes.extract(pot_temperature_constraint)
print(pt2_cubes)
print(pt2_cubes == potential_temperature_cubes)

So far we have shown constraining at load time and extracting from a CubeList. We can also perform an extract operation on a Cube.

As before, we constrain the CubeList to select only cubes named air_potential_temperature. We then index out the first cube from the CubeList that is returned. On this cube, we extract model level number 10. The Constraint constructor takes arbitrary keywords to constrain coordinate values.


In [ ]:
temp_cubes = cubes.extract('air_potential_temperature')
temp_cube = temp_cubes[0]
print(temp_cube.extract(iris.Constraint(model_level_number=10)))

Note that this now returns a Cube: an extract operation on a CubeList returns a CubeList, whereas an extract operation on a Cube returns a Cube.

You will notice that the Cube returned no longer has a model_level_number dimension, resulting in a 3D cube rather than a 4D cube. model_level_number has been demoted from a dimension coordinate to a scalar coordinate.
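
The loss of a dimension can be pictured with plain NumPy. The sketch below is a conceptual analogy only, not how Iris implements extraction; the level values in `levels` and the array shape are assumptions chosen to resemble the sample cube.

```python
import numpy as np

# Assumed coordinate values for the model_level_number dimension.
levels = np.array([1, 4, 7, 10, 13, 16, 19])
data = np.zeros((3, levels.size, 204, 187))

# Selecting by value means first finding the position that holds that value.
(position,) = np.flatnonzero(levels == 10)

# Indexing with a single integer drops that axis: 4D becomes 3D.
level_10 = data[:, position]
print(position, level_10.shape)
```

Selecting a single value leaves nothing to vary along that axis, which is why the coordinate can only survive as a scalar.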

Exercise:

Take a look at the documentation for iris.Constraint. How would you modify the above example to select two model level numbers [4,10]?


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load  solutions/iris_exercise_3.2a

As you may have seen in the iris.Constraint documentation, we can also make a Constraint from an arbitrary function that operates on each cell of a coordinate. This lets you perform more complicated extraction operations.


In [ ]:
def less_than_10(cell):
    """Return True for values that are less than 10."""
    return cell < 10

print(cubes.extract(iris.Constraint('air_potential_temperature',
                                    model_level_number=less_than_10)))
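
Conceptually, a function constraint keeps the positions along a coordinate for which the function returns True. The following NumPy sketch illustrates that idea under stated assumptions (the level values and the small array shape are made up for illustration); it is not Iris's actual implementation.

```python
import numpy as np

def less_than_10(value):
    """Return True for values that are less than 10."""
    return value < 10

# Assumed coordinate values and a small stand-in data array.
levels = np.array([1, 4, 7, 10, 13, 16, 19])
data = np.zeros((3, levels.size, 4, 5))

# Evaluate the function once per coordinate value to build a boolean mask.
keep = np.array([less_than_10(value) for value in levels])

# Apply the mask along the model-level axis; the axis is kept, only shortened.
selected = data[:, keep]
print(levels[keep], selected.shape)
```

Unlike selecting a single value, a function that matches several values leaves the dimension in place with fewer points.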

Exercise:

Load in the `air_temperature` cube from the file in iris.sample_data_path('air_temp.pp').
From this cube, extract data within 30 degrees of the equator.
Hint: Write a function that returns True for values that are in the range [-30,30].


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load  solutions/iris_exercise_3.2b

Combining multiple constraints can be done in a couple of ways. One way involves creating a single Constraint instance with multiple requirements.

For example, using our original CubeList, we extract a cube named air_potential_temperature with model level number 10.


In [ ]:
pot_temperature_constraint = iris.Constraint('air_potential_temperature',
                                             model_level_number=10)
print(cubes.extract(pot_temperature_constraint))

To combine different Constraints we use an &.

For example, below we extract the data named air_potential_temperature, and with model level numbers 4 and 10.


In [ ]:
print(cubes.extract('air_potential_temperature' & 
                    iris.Constraint(model_level_number=[4, 10])))

Time Constraints

It is common to want to build a constraint for time.
This can be achieved by comparing cells containing datetimes.

There are a few different approaches for producing time constraints in Iris. We will focus here on just one of them.

This approach allows us to access individual components of cell datetime objects and run comparisons on those:


In [ ]:
time_constraint = iris.Constraint(time=lambda cell: cell.point.hour == 11)
print(temp_cube.extract(time_constraint).summary(True))
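
To see what the lambda receives, the sketch below applies the same test to hand-made cells. The `Cell` namedtuple here is a hypothetical stand-in with `point` and `bound` fields, and the three datetimes are assumed values, so this runs without Iris or the sample data.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a coordinate cell; only `point` is used here.
Cell = namedtuple('Cell', ['point', 'bound'])

# Assumed time points, one cell per hour.
cells = [Cell(datetime(2009, 11, 19, hour), None) for hour in (10, 11, 12)]

# The same comparison as in the time constraint: test the hour of each point.
def is_11_oclock(cell):
    return cell.point.hour == 11

matching = [cell.point for cell in cells if is_11_oclock(cell)]
print(matching)
```

Other datetime components, such as `year`, `month` or `day`, can be compared in the same way.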

Exercise:

Try constraining temp_cube to select the times that fall on the 19th of the month.


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load  solutions/iris_exercise_3.2c

3.3 Iterating Over a Cube

We can loop through subcubes within a larger cube using the cube methods slices and slices_over.

To demonstrate this, we start with a cube that we have constrained so that it is named "air_potential_temperature" and contains only the data on model level number 1.


In [ ]:
fname = iris.sample_data_path('uk_hires.pp')
cube = iris.load_cube(fname,
                      iris.Constraint('air_potential_temperature',
                                      model_level_number=1))
print(cube.summary(True))

The slices method returns all the slices of a cube on the dimensions specified by the coordinates passed to the slices method.

So in this example, each grid_latitude / grid_longitude slice of the cube is returned:


In [ ]:
for subcube in cube.slices(['grid_latitude', 'grid_longitude']):
    print(subcube.summary(shorten=True))

We can use slices_over to return one subcube for each coordinate value in the specified coordinate. This helps us when trying to retrieve all the slices along a given cube dimension.

For example, to slice over the time dimension we would do the following:


In [ ]:
for subcube in cube.slices_over('time'):
    print(subcube.summary(shorten=True))

Notice how the above two examples returned the same results. Starting with a time/grid_latitude/grid_longitude cube, to retrieve all the slices over the time dimension, with slices_over we specify only the time dimension, whereas with slices we specify all of the cube's dimensions except the time dimension.
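
The difference between the two methods becomes clearer when more than one dimension is left out. The NumPy sketch below is an analogy under stated assumptions (a 4D array of shape (3, 7, 204, 187) standing in for a time/model_level_number/grid_latitude/grid_longitude cube); it counts the pieces each style of iteration would produce and does not call Iris.

```python
import numpy as np

# Stand-in 4D array; the shape is an assumption.
data = np.zeros((3, 7, 204, 187))

# Analogue of naming the last two dimensions to keep in each slice:
# one 2D slice for every combination of the leading indices.
slices_2d = [data[index] for index in np.ndindex(data.shape[:2])]

# Analogue of slicing over the first dimension only:
# one slice per index along that axis, keeping all other dimensions.
slices_over_first = [data[t] for t in range(data.shape[0])]

print(len(slices_2d), slices_2d[0].shape)
print(len(slices_over_first), slices_over_first[0].shape)
```

In short, slices names the dimensions each piece keeps, while slices_over names the dimensions that are iterated away.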

Exercise:

Load in the `air_potential_temperature` cube from the file in iris.sample_data_path('uk_hires.pp').
Iterate over this cube such that you print out subcubes of the following dimensionality:
air_potential_temperature / (K) (time: 3; grid_latitude: 204; grid_longitude: 187)
Do this first using slices then again using slices_over.


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load  solutions/iris_exercise_3.3a

3.4 Section Review Exercise

1. Load the sea_water_potential_temperature cube from the file in iris.sample_data_path('atlantic_profiles.nc'). Store this cube in a variable called cube.


In [ ]:
# EDIT for user code ...

In [ ]:
# SAMPLE SOLUTION :  Un-comment and execute the following to see a possible solution ...

# %load solutions/iris_exercise_3.4a

2. Extract the data with a latitude below -3°. Store the result in a new variable.

In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_3.4b

3. Iterate through the first 5 depth levels, printing the mean data value for each subcube slice.

In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_3.4c

3.5 Section Summary : Subcube Extraction

In this section we learnt:

  • cubes can be indexed like NumPy arrays to produce sub-cubes
  • 'constraint' objects can be used to load only part of the data, or to extract part of an already loaded Cube or CubeList
  • constraints can compare components of a cell's datetime to extract data by date and time
  • a cube can be "sliced up" along some of its dimensions, looping over all the possible subcube 'slices'.