3. Xarray & PyNIO

Xarray and PyNIO make reading and writing scientific data efficient and easy, because their internal data structures resemble the netCDF data model. Xarray can read netCDF and GRIB files (GRIB through the cfgrib engine, see section 3.6) and handles metadata following the netCDF CF conventions. The same is true for PyNIO, which can additionally read HDF and WRF files.

See also http://xarray.pydata.org and https://github.com/NCAR/pynio.

The examples below show how to use xarray and PyNIO to read data from a file and to work with coordinates and metadata.

3.1 Read a netCDF file

We start directly by opening and reading the netCDF file tsurf.nc from the subdirectory data.

Import the commonly used modules and define the variable fname (file name).


In [ ]:
import numpy as np

fname = './data/tsurf.nc'

1. xarray

First we have to load the xarray module, and because we are lazy, we use the common abbreviation xr for it.

The xarray function **xr.open_dataset** is used to read the content of the file.

The variable name ds is short for dataset and is commonly used for the returned object.


In [ ]:
import xarray as xr

ds = xr.open_dataset(fname)

print(ds)

Printing the dataset content gives you an overview of the dimension and variable names, their sizes, and the global file attributes.
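
If tsurf.nc is not at hand, the same kind of overview can be reproduced with a small in-memory dataset. The sketch below is only a stand-in: the grid sizes, values and attribute strings are invented for illustration and are not taken from tsurf.nc.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for tsurf.nc: 2 time steps on a 3 x 4 lat/lon grid.
# All sizes, values and attributes are made up for illustration.
time = np.array(['2001-01-01T00', '2001-01-01T06'], dtype='datetime64[ns]')
lat = np.array([-45.0, 0.0, 45.0])
lon = np.array([0.0, 90.0, 180.0, 270.0])

demo = xr.Dataset(
    data_vars={'tsurf': (('time', 'lat', 'lon'), np.full((2, 3, 4), 280.0))},
    coords={'time': time, 'lat': lat, 'lon': lon},
    attrs={'title': 'synthetic example'},
)

print(demo)                   # same kind of overview as print(ds)
print(list(demo.data_vars))   # names of the data variables only
print(dict(demo.sizes))       # dimension name -> length
print(demo.attrs)             # global attributes
```

The individual pieces of the overview (data variables, dimension sizes, global attributes) can be queried separately, which is handy when a file contains many variables.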


2. PyNIO

Like above, we have to import the module first, but this time it's Nio (that's short enough).

PyNIO's function to read the file is **Nio.open_file**.

The name f for the file object is often used in NCL scripts, which is why we use it here as well, but you can call it whatever you want.


In [ ]:
import Nio

f = Nio.open_file(fname, "r")

print(f)

This is very similar to the ncdump output, and corresponds to the output from xarray.


3.2 Show variable names and coordinates

It is always good to have a closer look at your data, and this can be done very easily with xarray and PyNIO.

OK, show me the variables stored in that file (oops, just one :D) and the coordinate variables, too.

1. xarray


In [ ]:
coords    = ds.coords
variables = ds.variables

print('--> coords:    \n\n', coords)
print('--> variables: \n\n', variables)

Ah, that's better. Here the time is displayed in a readable way, because xarray decodes it to NumPy's datetime64 data type under the hood. The variable and coordinate attributes are shown as well.


2. PyNIO

Let us see how PyNIO will do that.


In [ ]:
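# f.dimensions maps dimension names to their lengths;
# f.variables maps variable names to variable objects
coords_nio    = list(f.dimensions.keys())
variables_nio = list(f.variables.keys())

print(coords_nio)
print(variables_nio)

# print(f.variables['varName'])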

In [ ]:
varNames = f.variables.keys()

# print the metadata and the data values of every variable in the file
for name in varNames:
    print(f.variables[name])
    print(f.variables[name][:])

3.3 Select variable and coordinate variables

So far, we have only created a dataset (xarray) or a file object (PyNIO) that contains the coordinate variables and the variable data. Now we want to select the variable tsurf and the coordinate variables lat and lon.

1. xarray


In [ ]:
tsurf = ds.tsurf
lat   = tsurf.lat
lon   = tsurf.lon

print('Variable tsurf: \n', tsurf.data)
print('\nCoordinate variable lat: \n', lat.data)
print('\nCoordinate variable lon: \n', lon.data)
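
Because the coordinates stay attached to the variable, xarray can also select data by coordinate value instead of by index. The following sketch uses a synthetic DataArray (grid and values are invented, the name tsurf_demo is hypothetical) to contrast position-based isel with label-based sel:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for tsurf (values and grid are invented)
tsurf_demo = xr.DataArray(
    np.arange(24, dtype=float).reshape(2, 3, 4),
    dims=('time', 'lat', 'lon'),
    coords={'lat': [-45.0, 0.0, 45.0], 'lon': [0.0, 90.0, 180.0, 270.0]},
    name='tsurf',
)

# Position-based: first time step, all latitudes and longitudes
first_step = tsurf_demo.isel(time=0)

# Label-based: grid point closest to 40N, 100E
point = tsurf_demo.sel(lat=40.0, lon=100.0, method='nearest')

print(first_step.shape)
print(point.lat.item(), point.lon.item())
print(point.values)          # one value per time step
```

With method='nearest' the requested location does not have to match a grid point exactly; xarray picks the closest one along each coordinate.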

2. PyNIO

If you use PyNIO to open a file, the handling differs a little. With xarray you can retrieve the coordinate variables directly from the selected variable (tsurf.lat, tsurf.lon), whereas with PyNIO you have to get them from the file object.


In [ ]:
tsurf_nio = f.variables['tsurf'][:,:,:]
lat_nio   = f.variables['lat'][:]
lon_nio   = f.variables['lon'][:]

print('Variable tsurf_nio: \n', tsurf_nio)
print('\nCoordinate variable lat_nio: \n', lat_nio)
print('\nCoordinate variable lon_nio: \n', lon_nio)

The variables have different data types:

  • xarray returns the variable as a DataArray, which bundles the values with dimension names, coordinates and attributes.
  • PyNIO returns the variable data as a plain numpy ndarray.

In [ ]:
print(type(tsurf))
print(type(tsurf_nio))
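
The practical consequence is that a DataArray keeps dimension names, coordinates and attributes attached to the values, while an ndarray carries the values only. A small sketch with synthetic data (the array contents and the units attribute are invented) shows how to move between the two:

```python
import numpy as np
import xarray as xr

values = np.array([[280.0, 281.0], [282.0, 283.0]])

# Wrap a plain ndarray: names and metadata travel with the data
da = xr.DataArray(
    values,
    dims=('lat', 'lon'),
    coords={'lat': [0.0, 10.0], 'lon': [0.0, 10.0]},
    attrs={'units': 'K'},
)

# Unwrap again: .values returns the underlying numpy ndarray
plain = da.values

print(type(da).__name__, da.dims, da.attrs)
print(type(plain).__name__, plain.shape)
```

This is also why the PyNIO examples keep going back to the file variable object for metadata: the extracted array no longer knows about it.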

3.3 Dimensions, shape and size

To get more information about the dimensions, shape and size of a variable, we can use the appropriate attributes.

1. xarray


In [ ]:
dimensions = ds.dims
shape = tsurf.shape
size  = tsurf.size
rank  = len(shape)

print('dimensions: ', dimensions)
print('shape:      ', shape)
print('size:       ', size)
print('rank:       ', rank)

2. PyNIO


In [ ]:
dimensions_nio = f.dimensions
shape_nio = tsurf_nio.shape
size_nio  = tsurf_nio.size
rank_nio  = len(shape_nio)   # or rank_nio = f.variables["tsurf"].rank

print('dimensions: ', dimensions_nio)
print('shape:      ', shape_nio)
print('size:       ', size_nio)
print('rank_nio:   ', rank_nio)
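
The quantities printed above are tied together: the rank is the number of dimensions, and the size is the product of the shape entries. A NumPy-only sketch with an arbitrary (2, 3, 4) array, where the layout and the dimension names are assumptions chosen purely for illustration:

```python
import math
import numpy as np

arr = np.zeros((2, 3, 4))        # assumed layout: (time, lat, lon)

shape = arr.shape
rank = arr.ndim                  # same as len(shape)
size = arr.size                  # same as the product of the shape entries

# pair hypothetical dimension names with their lengths
dim_names = ('time', 'lat', 'lon')
sizes = dict(zip(dim_names, shape))

print(shape, rank, size)
print(size == math.prod(shape))
print(sizes)
```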

3.4 Variable attributes

Variable attributes are very important for working with the data correctly, because they carry information such as the units and the long name.

1. xarray


In [ ]:
attributes = list(tsurf.attrs)

print('attributes: ', attributes)

2. PyNIO

To get the attributes we have to use the file variable object f.variables['tsurf'] and not the numpy array tsurf_nio.


In [ ]:
attributes_nio = list(f.variables['tsurf'].attributes.keys())

print('attributes_nio: ', attributes_nio)

Let's see how we can get the content of an attribute.

1. xarray


In [ ]:
long_name = tsurf.long_name
units = tsurf.units

print('long_name: ', long_name)
print('units:     ', units)

2. PyNIO

And here we have to use the file variable object f.variables['tsurf'] again.


In [ ]:
long_name_nio = f.variables["tsurf"].attributes['long_name']
units_nio = f.variables["tsurf"].attributes['units']

print('long_name_nio: ', long_name_nio)
print('units_nio:     ', units_nio)
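
Not every file provides every attribute, so direct access can fail for a missing name. A defensive sketch for the xarray side, using a synthetic DataArray whose attribute values are invented: attrs behaves like a regular dictionary, so get() with a fallback value avoids an exception for a missing key.

```python
import numpy as np
import xarray as xr

# Synthetic variable; attribute values are made up for illustration
demo_var = xr.DataArray(
    np.zeros((2, 2)),
    dims=('lat', 'lon'),
    attrs={'long_name': 'surface temperature', 'units': 'K'},
)

units = demo_var.attrs.get('units', 'unknown')
std_name = demo_var.attrs.get('standard_name', 'not set')   # absent here

# list all attributes together with their content
for key, value in demo_var.attrs.items():
    print(f'{key}: {value}')

print('units:        ', units)
print('standard_name:', std_name)
```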

3.5 Time

Xarray and PyNIO handle time very differently. Xarray converts the time values to readable dates using NumPy's datetime64 data type internally, while PyNIO only returns the numeric values of the coordinate variable time.

1. xarray


In [ ]:
time = ds.time.data

print('timestep 0: ', time[0])
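
A datetime64 value prints with full nanosecond precision by default. The following NumPy-only sketch shows one way to shorten it; the time stamp t0 is an assumed value standing in for time[0].

```python
import numpy as np

# Assumed time stamp; in the notebook this would be time[0]
t0 = np.datetime64('2001-01-01T06:00:00', 'ns')

as_minutes = np.datetime_as_string(t0, unit='m')   # ISO string to the minute
as_day = np.datetime_as_string(t0, unit='D')       # date only

# extract the year by casting to yearly resolution (years since 1970)
year = t0.astype('datetime64[Y]').astype(int) + 1970

print(as_minutes)
print(as_day)
print(year)
```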

2. PyNIO


In [ ]:
time_nio = f.variables['time'][:]

print('timestep 0: ', time_nio[0])

The returned time value is the value stored in the netCDF file and it has to be converted to a date string. To convert the time value to a string like xarray's above, the units and the calendar attribute have to be known. In this example, we use the netCDF4 module to convert the time values.


In [ ]:
import netCDF4

time_nio_units    = f.variables["time"].attributes['units']
time_nio_calendar = f.variables["time"].attributes['calendar']

date_nio = netCDF4.num2date(time_nio[0], units=time_nio_units, calendar=time_nio_calendar)

print('timestep 0: ', date_nio)
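
To make the conversion step less of a black box, here is a simplified sketch of what it involves, written with the standard library only. The helper decode_time is hypothetical and not part of PyNIO or netCDF4; it assumes a units string of the form "<unit> since <ISO date>" and a standard (Gregorian) calendar, so for other calendars netCDF4.num2date remains the right tool.

```python
from datetime import datetime, timedelta

# seconds per supported time unit (fixed-length units only)
SECONDS_PER_UNIT = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400}

def decode_time(value, units):
    """Convert a numeric time value to a datetime (standard calendar only)."""
    unit, _, ref = units.partition(' since ')
    reference = datetime.fromisoformat(ref.strip())
    offset = timedelta(seconds=float(value) * SECONDS_PER_UNIT[unit.strip()])
    return reference + offset

# Assumed units string; in practice read it from the time variable's attributes
print(decode_time(6, 'hours since 2001-01-01 00:00:00'))
```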

3.6 Read a GRIB file

To read a GRIB file with PyNIO, nothing has to be changed except the file name. Xarray needs the additional module cfgrib, which is used as a so-called engine.

1. xarray


In [ ]:
import cfgrib

ds2 = xr.open_dataset('./data/MET9_IR108_cosmode_0909210000.grb2', engine='cfgrib')

variables2 = ds2.variables

print('--> variables2: \n\n', variables2)
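
If a script has to handle both netCDF and GRIB files, the engine can be chosen from the file name. The helper choose_engine below is a hypothetical convenience function, not part of xarray, and the list of suffixes is an assumption about how the GRIB files are named.

```python
import importlib.util
from pathlib import Path

# Assumed file name suffixes for GRIB files
GRIB_SUFFIXES = {'.grb', '.grb2', '.grib', '.grib2'}

def choose_engine(path):
    """Return 'cfgrib' for GRIB-like file names, else None (xarray's default)."""
    if Path(path).suffix.lower() in GRIB_SUFFIXES:
        # fail early with a clear message if cfgrib is not installed
        if importlib.util.find_spec('cfgrib') is None:
            raise ImportError('cfgrib is required to read GRIB files with xarray')
        return 'cfgrib'
    return None

print(choose_engine('./data/tsurf.nc'))   # None: let xarray pick the backend
```

The result can be passed on directly, for example xr.open_dataset(path, engine=choose_engine(path)).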

2. PyNIO


In [ ]:
f2 = Nio.open_file('./data/MET9_IR108_cosmode_0909210000.grb2', "r")

variables_nio2 = f2.variables.keys()

for i in variables_nio2:
    print(f2.variables[i])
    print(f2.variables[i][:])
