In [ ]:
%matplotlib inline

Iris introduction course

1. The Iris Cube

Learning Outcome: by the end of this section, you will be able to explain the capabilities and functionality of Iris Cubes and Coordinates.

Duration: 1 hour

Overview:
1.1 Introduction to the Iris Cube
1.2 Working with a Cube
1.3 Cube Attributes
1.4 Coordinates
1.5 Exercise
1.6 Summary of the Section


In [ ]:
import iris

1.1 Introduction to the Iris Cube

The top level object in Iris is called a Cube. A Cube contains data and metadata about a single phenomenon and is an implementation of the data model interpreted from the Climate and Forecast (CF) Metadata Conventions.

Each cube has:

  • A data array (typically a NumPy array).
  • A "name", preferably a CF "standard name" to describe the phenomenon that the cube represents.
  • A collection of coordinates to describe each of the dimensions of the data array. These coordinates are split into two types:
    • Dimension Coordinates are numeric, monotonic and represent a single dimension of the data array. There may be only one Dimension Coordinate per data dimension.
    • Auxiliary Coordinates can be of any type, including discrete values such as strings, and may represent more than one data dimension.
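
The "numeric, monotonic" requirement on Dimension Coordinates can be checked by hand. The sketch below is plain Python rather than Iris code: the helper name is_strictly_monotonic is hypothetical, and reading "monotonic" as strictly increasing or strictly decreasing is an assumption of this sketch.

```python
def is_strictly_monotonic(points):
    """Return True if the points only ever increase, or only ever decrease."""
    pairs = list(zip(points, points[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


print(is_strictly_monotonic([10.0, 20.0, 30.0]))  # True
print(is_strictly_monotonic([30.0, 20.0, 10.0]))  # True
print(is_strictly_monotonic([10.0, 30.0, 20.0]))  # False
```

A sequence that fails this check, or a set of non-numeric labels such as strings, would need to be held in an Auxiliary Coordinate instead.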

A fuller explanation is available in the Iris user guide.

Let's take a simple example to demonstrate the Cube concept.

Suppose we have a NumPy array of shape (3, 2, 4), where dimensions 0, 1 and 2 have lengths 3, 2 and 4 respectively.

The Iris Cube to represent this data may consist of:

  • a standard name of "air_temperature" and units of "kelvin"

  • a data array of shape (3, 2, 4)

  • a coordinate, mapping to dimension 0, consisting of:

    • a standard name of "height" and units of "meters"
    • an array of length 3 representing the 3 height points
  • a coordinate, mapping to dimension 1, consisting of:

    • a standard name of "latitude" and units of "degrees"
    • an array of length 2 representing the 2 latitude points
    • a coordinate system such that the latitude points could be fully located on the globe
  • a coordinate, mapping to dimension 2, consisting of:

    • a standard name of "longitude" and units of "degrees"
    • an array of length 4 representing the 4 longitude points
    • a coordinate system such that the longitude points could be fully located on the globe
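
The structure listed above can be sketched with a toy model in plain Python. This is not how Iris implements cubes: ToyCoord and ToyCube are hypothetical classes, the point values are invented, and coordinate systems are left out for brevity. The sketch only shows the key constraint that each coordinate supplies one point per index along the data dimension it maps to.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoord:
    """Toy stand-in for a coordinate (hypothetical, not an Iris class)."""
    standard_name: str
    units: str
    points: list
    dim: int  # index of the data dimension this coordinate describes


@dataclass
class ToyCube:
    """Toy stand-in for a cube: a name, units, a data shape and coordinates."""
    standard_name: str
    units: str
    shape: tuple
    coords: list = field(default_factory=list)

    def add_coord(self, coord):
        # A coordinate needs exactly one point per index along its dimension.
        expected = self.shape[coord.dim]
        if len(coord.points) != expected:
            raise ValueError(
                f"{coord.standard_name} has {len(coord.points)} points, "
                f"but dimension {coord.dim} has length {expected}"
            )
        self.coords.append(coord)


# Point values are invented purely for illustration.
cube_sketch = ToyCube("air_temperature", "kelvin", shape=(3, 2, 4))
cube_sketch.add_coord(ToyCoord("height", "meters", [100.0, 200.0, 300.0], dim=0))
cube_sketch.add_coord(ToyCoord("latitude", "degrees", [50.0, 51.0], dim=1))
cube_sketch.add_coord(ToyCoord("longitude", "degrees", [0.0, 1.0, 2.0, 3.0], dim=2))

for coord in cube_sketch.coords:
    print(coord.dim, coord.standard_name, coord.units, len(coord.points))
```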

Compared with a simple array, the Cube has taken on more information.


1.2 Working with a Cube

To load in a Cube from a file, we make use of the iris.load function.

Exercise:

Take a look at the documentation for `iris.load` to see how it is called.

For the purpose of this course, we will be using the sample data provided with Iris. The utility function iris.sample_data_path returns the path at which a given sample data file is installed. We assign that path to a variable called fname.


In [ ]:
fname = iris.sample_data_path('space_weather.nc')

Exercise:

Try printing fname, to see where the sample data is installed on your system.


In [ ]:
#
# edit space for user code ...
#

We load in the filepath fname with iris.load.


In [ ]:
cubes = iris.load(fname)
print(cubes)

iris.load returns an iris.cube.CubeList of all the cubes found in the file. From the above printout, we can see that we have loaded two cubes from the file, one representing the "total electron content" and the other representing "electron density". We can infer further detail about the returned cubes from this printout, such as the units, dimensions and shape.

Exercise:

What are the dimensions of the "total electron content" cube?
What are the units of the "electron density" cube?


In [ ]:
#
# edit space for user notes ...
#

SAMPLE SOLUTION:
Un-comment and execute the following to load a possible solution into the cell.
Then run the cell again to see its output.


In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.2a

To see more detail about a specific cube, we can print out a single cube from the cubelist. We can select the second cube in the cubelist with indexing, and then print out what it returns.


In [ ]:
# Select the second cube in the cubelist (indexing starts at 0).
second_cube = cubes[1]
print(second_cube)

As before, we have an overview of the cube's dimensions as well as the cube's name and units. We also have further detail on the cube's metadata, such as the Dimension Coordinates, Auxiliary Coordinates and Attributes.

In the printout, the dimension marker 'x' shows which dimensions apply to each coordinate. For example, we can see that the latitude Auxiliary Coordinate varies along the grid_latitude and grid_longitude dimensions.

Whilst the printout of a cube gives a nice overview of the cube's metadata, we can dig deeper by inspecting the attributes of our cube object, as covered in the next section.


1.3 Cube Attributes

We load in a different file (using the iris.sample_data_path utility function, as before, to give us the path of the file) and index out the first cube from the cubelist that is returned.


In [ ]:
fname = iris.sample_data_path('A1B_north_america.nc')
cubes = iris.load(fname)
cube = cubes[0]
print(cube)

We can see that we have loaded and selected an air_temperature cube with time, latitude and longitude dimensions and the associated Dimension Coordinates. We also have a forecast_period Auxiliary Coordinate which maps to the time dimension. Our cube also has two scalar coordinates, forecast_reference_time and height, and a cell method of mean: time (6 hour), which means that the cube contains 6-hourly mean air temperatures.

To access the values of air temperature in the cube we use the data property. This is either a NumPy array or, in some cases, a NumPy masked array. It is very important to note that for most of the supported filetypes in Iris, the cube's data isn't actually loaded until you request it via this property (either directly or indirectly). After you've accessed the data once, it is stored on the cube and thus won't be loaded from disk again.

To find the shape of a cube's data it is possible to inspect cube.data.shape or cube.data.ndim, but this will trigger any unloaded data to be loaded. For this reason, shape and ndim are also available as properties directly on the cube, where they do not load the data unnecessarily.
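
The deferred-loading behaviour described above can be illustrated with a small plain-Python sketch. It is not Iris's implementation: LazyToyCube, read_from_disk and load_count are hypothetical names used only to show the idea of loading on first access and reusing the result afterwards.

```python
class LazyToyCube:
    """Toy object that defers an expensive load until .data is first used."""

    def __init__(self, shape, loader):
        self.shape = shape      # known from metadata, no data needed
        self._loader = loader   # function that performs the real load
        self._data = None
        self.load_count = 0     # counts how often the loader has run

    @property
    def ndim(self):
        # Derived from the shape alone, so it never triggers a load.
        return len(self.shape)

    @property
    def data(self):
        # Load on first access only; later accesses reuse the stored result.
        if self._data is None:
            self._data = self._loader()
            self.load_count += 1
        return self._data


def read_from_disk():
    """Stand-in for an expensive file read."""
    return [[0.0] * 4 for _ in range(2)]


toy = LazyToyCube((2, 4), read_from_disk)
print(toy.shape, toy.ndim, toy.load_count)  # (2, 4) 2 0
toy.data
toy.data
print(toy.load_count)  # 1
```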


In [ ]:
print(cube.shape)
print(cube.ndim)
print(type(cube.data))

Exercise:

From the above output we can see that cube.data is a NumPy masked array.
How would you find out the fill value of this masked array?


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.3a

The standard_name, long_name and to an extent var_name are all attributes to describe the phenomenon that the cube represents. The name() method is a convenience that looks at the name attributes in the order they are listed above, returning the first non-empty string.
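
The fallback order used by name() can be sketched as a plain function. This is an illustration only: first_available_name is a hypothetical helper, and the "unknown" default returned when every name is empty is an assumption of this sketch.

```python
def first_available_name(standard_name=None, long_name=None, var_name=None,
                         default="unknown"):
    """Return the first non-empty name, checking in the documented order."""
    for candidate in (standard_name, long_name, var_name):
        if candidate:
            return candidate
    return default


print(first_available_name("air_temperature", "Air temperature", "tas"))  # air_temperature
print(first_available_name(None, "Air temperature", "tas"))               # Air temperature
print(first_available_name(None, None, "tas"))                            # tas
print(first_available_name())                                             # unknown
```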


In [ ]:
print(cube.standard_name)
print(cube.long_name)
print(cube.var_name)
print(cube.name())

standard_name is restricted to be a CF standard name (see the CF standard name table).

If there is not a suitable CF standard name, cube.standard_name is set to None and the long_name is used instead.
long_name is less restrictive and can be set to be any string.

var_name is the name of a netCDF file variable in the input file, or the name to be used in output. This is normally unimportant, as CF data is identified by standard_name instead. (Note: although they are often the same, some standard names are not valid as netCDF variable names.)

To rename a cube, it is possible to set the attributes manually, but it is generally easier to use the rename() method.

Below we rename the cube to a string that we know is not a valid CF standard name.


In [ ]:
cube.rename("A name that isn't a valid CF standard name")

In [ ]:
print(cube.standard_name)
print(cube.long_name)
print(cube.var_name)
print(cube.name())

When renaming a cube, Iris will initially try to set cube.standard_name.
If the name is not a standard name, cube.long_name is set instead.
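
The decision made when renaming can be sketched in plain Python. This is a simplified illustration, not the Iris source: rename_sketch is a hypothetical function, KNOWN_STANDARD_NAMES is a tiny invented subset of the CF standard name table, and clearing whichever name is not set is an assumption of this sketch.

```python
# A tiny, hypothetical subset of the CF standard name table.
KNOWN_STANDARD_NAMES = {"air_temperature", "air_pressure"}


def rename_sketch(names, new_name):
    """Return a copy of the names dict with new_name stored in the right slot."""
    updated = dict(names)
    if new_name in KNOWN_STANDARD_NAMES:
        # Valid standard name: store it there and clear the long name.
        updated["standard_name"] = new_name
        updated["long_name"] = None
    else:
        # Not a standard name: fall back to the less restrictive long name.
        updated["standard_name"] = None
        updated["long_name"] = new_name
    return updated


names = {"standard_name": "air_temperature", "long_name": None}
print(rename_sketch(names, "A name that isn't a valid CF standard name"))
print(rename_sketch(names, "air_pressure"))
```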

Exercise:

Take a look at the CF standard name table and try renaming the cube to an accepted name.


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.3b

The units attribute on a cube tells us the units of the numbers held in the data array.


In [ ]:
print(cube.units)
print(cube.data.max())

We can convert the cube to another unit using the convert_units method, which will automatically update the data array.
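
Converting units changes the numbers as well as the label. As a worked illustration in plain Python (not Iris code), and assuming the values start in kelvin, a conversion to Celsius subtracts 273.15 from every value. The function name and the sample values are invented for this sketch.

```python
KELVIN_TO_CELSIUS_OFFSET = 273.15


def kelvin_to_celsius(values):
    """Convert a list of temperatures from kelvin to degrees Celsius."""
    return [value - KELVIN_TO_CELSIUS_OFFSET for value in values]


temperatures_k = [273.15, 288.15, 300.0]  # made-up values in kelvin
temperatures_c = kelvin_to_celsius(temperatures_k)
print([round(t, 2) for t in temperatures_c])  # [0.0, 15.0, 26.85]
```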


In [ ]:
cube.convert_units('Celsius')
print(cube.units)
print(cube.data.max())

A cube also has a dictionary for extra general purpose attributes, which can be accessed with the cube.attributes attribute:


In [ ]:
print(cube.attributes)

Exercise:

Update the `cube.attributes` dictionary with a new entry.
For example {'comment': 'Original data had units of kelvin'}.


In [ ]:
#
# edit space for user code ...
#

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.3c

1.4 Coordinates

As we've seen, cubes need coordinate information to help us describe the underlying phenomenon. Typically a cube's coordinates are accessed with the coords or coord methods. coords returns a list of matching coordinates, possibly of length 0, whereas coord must match exactly one coordinate for the given parameter filters and raises an error otherwise.
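
The difference between the two lookups can be sketched in plain Python. The functions find_coords and find_coord below are hypothetical stand-ins operating on a list of dictionaries, and the sketch raises a generic LookupError rather than the exception Iris itself uses.

```python
def find_coords(coords, name):
    """Return every coordinate whose name matches (possibly an empty list)."""
    return [coord for coord in coords if coord["name"] == name]


def find_coord(coords, name):
    """Return the single matching coordinate, or fail if there is not exactly one."""
    matches = find_coords(coords, name)
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly 1 coordinate named {name!r}, found {len(matches)}"
        )
    return matches[0]


toy_coords = [{"name": "time"}, {"name": "latitude"}, {"name": "longitude"}]
print(find_coords(toy_coords, "height"))  # []
print(find_coord(toy_coords, "time"))     # {'name': 'time'}
```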

For example, to access the time coordinate, and print the first 4 times:


In [ ]:
time = cube.coord('time')
print(time[:4])

The coordinate interface is very similar to that of a cube. The attributes that exist on both cubes and coordinates are: standard_name, long_name, var_name, units, attributes and shape. Similarly, the name(), rename() and convert_units() methods also exist on a coordinate.


A coordinate does not have data. Instead, it has points and bounds (bounds may be None). In Iris, time coordinates are currently represented as "a number since an epoch":


In [ ]:
print(repr(time.units))
print(time.points[:4])
print(time.bounds[:4])

These numbers can be converted to datetime objects with the unit's num2date method. Dates can be converted back again with the date2num method:


In [ ]:
import datetime

print(time.units.num2date(time.points[:4]))
print(time.units.date2num(datetime.datetime(1970, 2, 1)))
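
The idea of "a number since an epoch" can also be reproduced with the standard library alone. The sketch below assumes a unit of hours since 1970-01-01 00:00:00 and the ordinary Gregorian calendar; both are assumptions, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)   # assumed reference time
ONE_HOUR = timedelta(hours=1)  # assumed unit of the time coordinate


def hours_to_datetime(hours):
    """Convert a number of hours since the epoch into a datetime."""
    return EPOCH + hours * ONE_HOUR


def datetime_to_hours(moment):
    """Convert a datetime into a number of hours since the epoch."""
    return (moment - EPOCH) / ONE_HOUR


print(hours_to_datetime(744.0))                 # 1970-02-01 00:00:00
print(datetime_to_hours(datetime(1970, 2, 1)))  # 744.0
```

Real datasets may use other calendars, such as a 360-day calendar, which the standard library cannot represent, so the numbers printed by the notebook cell above may differ from this sketch.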

Another important attribute on a coordinate is its coordinate system. Coordinate systems may be None for trivial coordinates, but particularly for spatial coordinates, they may be complex definitions of things such as the projection, ellipsoid and/or datum.


In [ ]:
lat = cube.coord('latitude')
print(lat.coord_system)

In this case, the latitude's coordinate system is a simple geographic latitude on a spherical globe of radius 6371229 (meters).


1.5 Section Review Exercise

1. Load the file at iris.sample_data_path('atlantic_profiles.nc'), store the resulting cube list in a variable called cubes, and print it.


In [ ]:
# EDIT for user code ...

In [ ]:
# SAMPLE SOLUTION :  Un-comment and execute the following to see a possible solution ...

# %load solutions/iris_exercise_1.5a

2. Loop through each of the cubes (e.g. for cube in cubes) and print the standard name of each.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.5b

3. Index cubes to retrieve the sea_water_potential_temperature cube. Note that indexing to extract single cubes is useful for exploratory data analysis, but it is better practice to use constraints (see 3. Cube Control and Subsetting.ipynb for more information).


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.5c

4. Get hold of the latitude coordinate on the sea_water_potential_temperature cube. Identify whether this coordinate has bounds. Print the minimum and maximum latitude points in the cube.


In [ ]:
# user code ...

In [ ]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_1.5d

1.6 Summary of Section: The Iris Cube

In this section we learnt:

  • An Iris cube, which contains data and metadata, is based on the CF data model, containing dimension and auxiliary coordinates.
  • Printing out a cube gives an overview of its metadata. We can get more information on the cube by inspecting its attributes (e.g. cube.standard_name, cube.units).
  • A coordinate has a similar interface to a cube, but a coordinate has points and bounds, where a cube has data.