In [1]:
import xarray as xr
import numpy as np
In [15]:
ds0 = xr.open_dataset("C:\\Users\\Norman\\.cate\\data_stores\\local\\local.esacci.SEALEVEL.mon.IND.MSLAMPH.multi-sensor.multi-platform.MERGED.2-0.r1.6ba656f9-7c90-3b15-aad2-c137f8b61909\\ESACCI-SEALEVEL-IND-MSLAMPH-MERGED-20161202000000-fv02.nc")
ds0
Out[15]:
In [25]:
ds0['phase'] is ds0['phase']
Out[25]:
We observe two issues here which make it hard to work with this data in the current version of Cate:
1. The data variables' dimensions are not in the order Cate expects, which ends with lat and lon, in this order;
2. There is a coordinate variable time, which is not used by any data variable. Instead, it seems to provide the individual time steps of the data within the temporal coverage that was used to produce the dataset. If we want to concatenate multiple Sea-Level datasets to form a time series, this will later fail, because Cate interprets time as an axis as defined by the CF-Conventions.
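Both issues can be checked programmatically. The following is a minimal sketch on a small synthetic dataset, not the real file: the variable name demo, the array sizes, and the (period, lon, lat) layout are assumptions made only for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the Sea-Level dataset. The sizes and the
# (period, lon, lat) dimension layout are assumptions, not the real file.
demo = xr.Dataset(
    data_vars={
        "ampl": (("period", "lon", "lat"), np.zeros((2, 4, 3))),
        "phase": (("period", "lon", "lat"), np.zeros((2, 4, 3))),
    },
    coords={
        "lat": ("lat", np.linspace(-10.0, 10.0, 3)),
        "lon": ("lon", np.linspace(0.0, 30.0, 4)),
        "time": ("time", np.array(["2016-01-15", "2016-02-15"],
                                  dtype="datetime64[ns]")),
    },
)

# Issue 1: data variables whose last two dimensions are not (lat, lon)
bad_order = [name for name, var in demo.data_vars.items()
             if var.dims[-2:] != ("lat", "lon")]

# Issue 2: does any data variable actually use the time dimension?
time_used = any("time" in var.dims for var in demo.data_vars.values())

print(bad_order)   # ['ampl', 'phase']
print(time_used)   # False
```

The same two checks can be run against ds0 to confirm the observations for the real file.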
We address the second issue first, by renaming the existing time variable to time_step:
In [5]:
ds = ds0.rename(time='time_step')
ds.time_step
Out[5]:
Then we add the time and time_bnds coordinate variables according to the CF-Conventions, using the xarray Dataset method assign_coords().
In [6]:
ds.time_step.encoding
Out[6]:
In [7]:
# Time encoding properties
dtype = ds.time_step.encoding['dtype']
units = ds.time_step.encoding['units']
# Get the time boundary values
ts = ds.time_step.values
ts_start = ts[0]
ts_end = ts[-1]
ts_mid = ts_start + 0.5 * (ts_end - ts_start)
# New coordinate variables according to CF
time = xr.DataArray(np.array([ts_mid]), dims='time')
time_bnds = xr.DataArray(np.array([[ts_start, ts_end]]), dims=['time', 'bnds'])
# Assign coordinate variables
ds = ds.assign_coords(time=time, time_bnds=time_bnds)
# Update coordinate variable attributes according to CF
ds.time.attrs.update(bounds='time_bnds')
# Set coordinate variable encodings
ds.time.encoding.update(units=units, dtype=dtype)
ds.time_bnds.encoding.update(units=units, dtype=dtype)
ds
Out[7]:
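The midpoint arithmetic above relies on NumPy's datetime64 and timedelta64 types. As a small sketch with made-up dates (the values below are assumptions, not taken from the dataset), the single time value and its bounds are computed like this:

```python
import numpy as np

# Hypothetical time steps; the real values come from ds.time_step.values
ts = np.array(["2016-01-01", "2016-01-16", "2016-01-31"],
              dtype="datetime64[ns]")

ts_start = ts[0]
ts_end = ts[-1]
# datetime64 + 0.5 * timedelta64 gives the midpoint of the covered period
ts_mid = ts_start + 0.5 * (ts_end - ts_start)

# One time value and one (start, end) pair: shapes (1,) and (1, 2)
time_values = np.array([ts_mid])
time_bnds_values = np.array([[ts_start, ts_end]])

print(ts_mid)
print(time_values.shape, time_bnds_values.shape)
```

The shapes (1,) and (1, 2) match the dims 'time' and ['time', 'bnds'] used when the DataArray objects are created.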
Then we insert a time axis of size one into the data variables using the xarray DataArray method expand_dims(), so we can later easily concatenate multiple Sea-Level datasets along the time dimension.
In [35]:
ds['ampl'] = ds.ampl.expand_dims('time')
ds['phase'] = ds.phase.expand_dims('time')
ds
Out[35]:
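To illustrate why the size-one time axis is useful, here is a minimal sketch that concatenates two synthetic single-step datasets with xr.concat(). The helper make_step() and its data are hypothetical and only mimic the structure prepared above.

```python
import numpy as np
import xarray as xr


def make_step(date, value):
    """Build a hypothetical single-time-step dataset (not the real file)."""
    ampl = xr.DataArray(np.full((2, 3), value), dims=("lat", "lon"))
    time = xr.DataArray(np.array([date], dtype="datetime64[ns]"), dims="time")
    ds = xr.Dataset({"ampl": ampl}).assign_coords(time=time)
    # Same step as above: give the data variable a time axis of size one
    ds["ampl"] = ds.ampl.expand_dims("time")
    return ds


steps = [make_step("2016-01-15", 1.0), make_step("2016-02-15", 2.0)]

# Because every dataset carries a time axis, they stack along it directly
combined = xr.concat(steps, dim="time")

print(combined.ampl.dims)      # ('time', 'lat', 'lon')
print(combined.sizes["time"])  # 2
```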
We now address the first issue by changing the order of the data variables' dimensions so that Cate can display them. We use the xarray Dataset method transpose():
In [36]:
ds = ds.transpose('time', 'bnds', 'time_step', 'period', 'lat', 'lon')
ds
Out[36]:
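Dataset.transpose() takes the full list of dimension names, but each variable is reordered using only the dimensions it actually has. A small sketch on a synthetic dataset (the names and shapes are assumptions for illustration) shows the effect:

```python
import numpy as np
import xarray as xr

# Hypothetical dataset with two variables that use different dimensions
demo = xr.Dataset({
    "ampl": (("period", "lon", "lat"), np.zeros((2, 4, 3))),
    "time_bnds": (("time", "bnds"), np.zeros((1, 2))),
})

# One call lists all dimensions; each variable picks out its own subset
reordered = demo.transpose("time", "bnds", "period", "lat", "lon")

print(reordered.ampl.dims)       # ('period', 'lat', 'lon')
print(reordered.time_bnds.dims)  # ('time', 'bnds')
```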
Finally, we write the modified dataset as a new NetCDF file using the xarray Dataset method to_netcdf():
In [37]:
ds.to_netcdf('ESACCI-SEALEVEL-IND-MSLAMPH-MERGED-20161202000000-fv02-Cate.nc')