Imports:
In [1]:
import sys
sys.path.append("../")
from netCDF4 import Dataset, MFDataset
import pyfesom as pf
import numpy as np
from mpl_toolkits.basemap import Basemap
import matplotlib.pylab as plt
%matplotlib inline
from matplotlib import cm
import xarray as xr
import pandas as pd
Loading the mesh. This mesh is rotated, so we use the default values. If your mesh is not rotated (or uses different rotation angles), don't forget to set the abg parameter.
In [2]:
meshpath = '/mnt/lustre01/work/ab0995/a270088/data/core_mesh/'
mesh = pf.load_mesh(meshpath, usepickle=False, usejoblib=True)
Open multiple files at once. Please have a look at this page to understand what chunks are for.
In [7]:
data = xr.open_mfdataset('/work/ab0995/a270067/fesom_echam/core/cpl_output_02/fesom.200?.oce.mean.nc',
                         chunks={'time': 12})
Now you have a Dataset that contains all the data.
In [11]:
data
Out[11]:
You can see that in this version of the FESOM output there is a bug with shifted time stamps (the times start from '2000-02-01'). We are going to fix it. Create the time stamps with pandas:
In [13]:
dates = pd.date_range('2000', '2010', freq='M')
dates
Out[13]:
Note that the end date has to be one year later than the last year of data: with month-end frequency the last stamp generated before '2010' is 2009-12-31. Now replace the time stamps in the data with the right ones:
In [14]:
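# Assign the new stamps as the time coordinate (newer xarray versions
# refuse direct assignment to the .data of a dimension coordinate).
data['time'] = dates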
In [15]:
data
Out[15]:
Good, we now have the right time stamps and can work with time.
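The time stamp fix can be tried out without the model output. The sketch below is only an illustration: the dataset, the dimension name `nodes` and the values are made up, and it uses month-start stamps (`freq='MS'`) because that alias behaves the same across pandas versions. The idea is the same for month-end stamps.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the model output: 24 monthly records on 5 nodes,
# with time stamps that wrongly start in February 2000.
shifted = pd.date_range('2000-02-01', periods=24, freq='MS')
ds = xr.Dataset(
    {'temp': (('time', 'nodes'), np.random.rand(24, 5))},
    coords={'time': shifted},
)

# Build the stamps we actually want and assign them as the new time coordinate.
fixed = pd.date_range('2000-01-01', periods=24, freq='MS')
ds = ds.assign_coords(time=fixed)

print(ds.time.values[0], ds.time.values[-1])
```

Only the coordinate changes here; the `temp` values and their order stay untouched.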
The time mean over the whole time period is simple:
In [16]:
temp_mean = data.temp.mean(dim='time')
Here we select the temp variable and apply mean to it. You also have to specify the dimension (dim) that you want to average over. You probably noticed that the "computation" was performed very quickly. This is because there was no computation at all, just preparation for it. To actually do the computation:
In [17]:
temp_mean = temp_mean.compute()
In [18]:
temp_mean
Out[18]:
One can use slices to select data over some time period:
In [21]:
data.temp.sel(time=slice('2000-01-01', '2003-12-31')).time
Out[21]:
Mean over this slice will look like this:
In [22]:
# The slice 2000-01-01 to 2003-12-31 includes both end points, i.e. four full years.
temp_mean_4years = data.temp.sel(time=slice('2000-01-01', '2003-12-31')).mean(dim='time')
temp_mean_4years = temp_mean_4years.compute()
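It may help to check how many records a label-based slice actually returns. The following is a minimal sketch on a made-up monthly series (the values and the month-start stamps are assumptions for illustration, not the model data); it shows that both end points of the slice are included.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly series covering 2000-2009 (120 records); values are made up.
time = pd.date_range('2000-01-01', periods=120, freq='MS')
temp = xr.DataArray(np.arange(120.0), coords={'time': time}, dims='time')

# Label-based slices include both end points.
subset = temp.sel(time=slice('2000-01-01', '2003-12-31'))
n_months = subset.sizes['time']
years = np.unique(subset.time.dt.year.values)

print(n_months, years)
```

The selection covers the years 2000 through 2003, so the mean is taken over 48 monthly records.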
Our data are monthly:
In [26]:
data.time[:14]
Out[26]:
The sel method also accepts explicit time steps, so if we select only the March values:
In [27]:
data.time[2::12]
Out[27]:
we can pass these values directly to sel. We also take the mean over the selected times and do the computation:
In [29]:
temp_march_mean = data.temp.sel(time=data.time[2::12]).mean(dim='time')
temp_march_mean = temp_march_mean.compute()
xarray has a more explicit syntax for selecting months (it returns an array that shows which month each record in your array corresponds to):
In [30]:
data['time.month']
Out[30]:
Using this syntax you can easily select March:
In [32]:
data.temp[data['time.month']==3]
Out[32]:
And make a mean over this month:
In [33]:
temp_march_mean = data.temp[data['time.month']==3].mean(dim='time')
temp_march_mean = temp_march_mean.compute()
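The two ways of selecting March give the same result only when the record starts in January and has no gaps. The sketch below compares them on a made-up series (values and month-start stamps are assumptions for illustration) and shows what positional selection picks once the record starts in February.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical three-year monthly series starting in January; values are made up.
time = pd.date_range('2000-01-01', periods=36, freq='MS')
temp = xr.DataArray(np.arange(36.0), coords={'time': time}, dims='time')

# Positional selection: every 12th record, starting from the third one.
by_position = temp.sel(time=temp.time[2::12]).mean(dim='time')

# Month-based selection: a boolean mask built from the month component.
by_month = temp[temp.time.dt.month == 3].mean(dim='time')

# Drop the first record so the series starts in February:
# the same positions no longer point at March.
shifted = temp.isel(time=slice(1, None))
picked_months = shifted.time[2::12].dt.month.values

print(float(by_position), float(by_month), picked_months)
```

For that reason the month-based mask is usually the safer choice when the start of the record is not known in advance.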
Please have a look at this page to see which "datetime components" are supported. At the time of this writing the list contains: “year”, “month”, “day”, “hour”, “minute”, “second”, “dayofyear”, “week”, “dayofweek”, “weekday” and “quarter”. xarray additionally provides a season component. You can select the winter temperature values and average over them with:
In [35]:
temp_DJF_mean = data.temp[data['time.season']=='DJF'].mean(dim='time')
temp_DJF_mean = temp_DJF_mean.compute()
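One detail worth keeping in mind is that the 'DJF' mask simply picks every December, January and February record, so January and February of the first year and December of the last year are included as well. A small sketch on made-up data (values and month-start stamps are assumptions for illustration) shows this, together with `groupby`, which gives the means for all four seasons at once.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical two-year monthly series; values are made up.
time = pd.date_range('2000-01-01', periods=24, freq='MS')
temp = xr.DataArray(np.arange(24.0), coords={'time': time}, dims='time')

# Boolean mask for winter records, as in the selection above.
is_winter = temp.time.dt.season == 'DJF'
winter = temp[is_winter]
winter_months = sorted(set(winter.time.dt.month.values.tolist()))

# Means for all seasons in one call.
seasonal_mean = temp.groupby('time.season').mean(dim='time')

print(winter_months, winter.sizes['time'])
print(seasonal_mean.season.values)
```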
Once again, please read this page to get more information. To resample the data into yearly means:
In [40]:
yearly_data = data.resample(time='1A').mean(dim='time')
In [37]:
yearly_data
Out[37]:
In [38]:
yearly_data = yearly_data.compute()
The complete list of frequencies can be found here. The most important for us are:
A - year
M - month
D - day
H - hour
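The spelling of some frequency aliases differs between pandas versions, so the sketch below uses 'YS' (year start) rather than the year-end alias, and also shows `groupby('time.year')` as an alternative way to get yearly means. The series is made up for illustration. With 'YS' the resampled values are labelled with the first day of each year, while `groupby` labels them with the year number.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical three-year monthly series; values are made up.
time = pd.date_range('2000-01-01', periods=36, freq='MS')
temp = xr.DataArray(np.arange(36.0), coords={'time': time}, dims='time')

# Yearly means via resampling, labelled with the start of each year.
yearly_resampled = temp.resample(time='YS').mean(dim='time')

# Yearly means via grouping on the year component.
yearly_grouped = temp.groupby('time.year').mean(dim='time')

print(yearly_resampled.values)
print(yearly_grouped.year.values, yearly_grouped.values)
```

Both calls return the same three values here; they differ only in how the result is labelled.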