In [2]:
import warnings
warnings.filterwarnings('ignore')

import glob
import xarray as xr

In [3]:
# Collect the monthly thetao files and sort them so they are in time order
infiles = glob.glob('/g/data/ua6/DRSv3/CMIP5/CCSM4/historical/mon/ocean/r1i1p1/thetao/latest/thetao_Omon_CCSM4_historical_r1i1p1_??????-??????.nc')
infiles.sort()

In [4]:
df = xr.open_mfdataset(infiles)

In [5]:
df.thetao


Out[5]:
<xarray.DataArray 'thetao' (time: 1872, lev: 60, j: 384, i: 320)>
dask.array<shape=(1872, 60, 384, 320), dtype=float32, chunksize=(120, 60, 384, 320)>
Coordinates:
  * lev      (lev) float64 5.0 15.0 25.0 35.0 ... 4.875e+03 5.125e+03 5.375e+03
  * j        (j) int32 1 2 3 4 5 6 7 8 9 ... 376 377 378 379 380 381 382 383 384
  * i        (i) int32 1 2 3 4 5 6 7 8 9 ... 312 313 314 315 316 317 318 319 320
    lat      (j, i) float32 -79.22052 -79.22052 -79.22052 ... 72.18933 72.185974
    lon      (j, i) float32 320.5625 321.6875 322.8125 ... 319.35068 319.7835
  * time     (time) datetime64[ns] 1850-01-16T12:00:00 ... 2005-12-16T12:00:00
Attributes:
    standard_name:     sea_water_potential_temperature
    long_name:         Sea Water Potential Temperature
    units:             K
    original_name:     TEMP
    comment:           TEMP no change, units from C to K
    original_units:    degC
    history:           2012-02-06T00:12:02Z altered by CMOR: Converted units ...
    cell_methods:      time: mean (interval: 30 days)
    cell_measures:     area: areacello volume: volcello
    associated_files:  baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation...

In [6]:
# Annual mean: a plain (unweighted) average of the monthly values in each year
thetao_annual = df.thetao.groupby('time.year').mean(dim='time')

In [7]:
thetao_annual


Out[7]:
<xarray.DataArray 'thetao' (year: 156, lev: 60, j: 384, i: 320)>
dask.array<shape=(156, 60, 384, 320), dtype=float32, chunksize=(1, 60, 384, 320)>
Coordinates:
  * lev      (lev) float64 5.0 15.0 25.0 35.0 ... 4.875e+03 5.125e+03 5.375e+03
  * j        (j) int32 1 2 3 4 5 6 7 8 9 ... 376 377 378 379 380 381 382 383 384
  * i        (i) int32 1 2 3 4 5 6 7 8 9 ... 312 313 314 315 316 317 318 319 320
    lat      (j, i) float32 -79.22052 -79.22052 -79.22052 ... 72.18933 72.185974
    lon      (j, i) float32 320.5625 321.6875 322.8125 ... 319.35068 319.7835
  * year     (year) int64 1850 1851 1852 1853 1854 ... 2001 2002 2003 2004 2005

In [8]:
# Accessing .values triggers the computation and returns a NumPy array
test = thetao_annual.values

In [9]:
test.shape


Out[9]:
(156, 60, 384, 320)
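To put the shapes in Out[5] and Out[9] in perspective, here is a rough back-of-the-envelope sketch of the memory involved. It uses only the shapes and the float32 dtype reported above; the helper name `nbytes` is made up for illustration, and the figures ignore any overhead from coordinates or intermediate copies.

```python
import numpy as np


def nbytes(shape, dtype):
    """Uncompressed size in bytes of an array with this shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


# Shapes copied from Out[5] (monthly) and Out[9] (annual means)
monthly_bytes = nbytes((1872, 60, 384, 320), "float32")
annual_bytes = nbytes((156, 60, 384, 320), "float32")

print(f"monthly: {monthly_bytes / 1e9:.1f} GB")  # monthly: 55.2 GB
print(f"annual:  {annual_bytes / 1e9:.1f} GB")   # annual:  4.6 GB
```

The full monthly array would be about twelve times larger than the annual result, so loading it eagerly is unlikely to fit in memory on most machines.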

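xarray is using dask under the hood here: the `dask.array<...>` line in Out[5] shows that `open_mfdataset` returned a lazy, chunked array rather than loading the files. The annual mean is therefore only computed, chunk by chunk, when `.values` is called in In [8], which is why it's able to perform this task without holding all of the monthly data in memory at once.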
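The same behaviour can be checked on a toy example. This is a minimal sketch, assuming dask is installed: the synthetic array, its values, and the chunk size are made up for illustration and stand in for the real `thetao` data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic stand-in for the monthly data: 24 months on a 2 x 3 grid
time = pd.date_range("2000-01-01", periods=24, freq="MS")
data = np.arange(24 * 2 * 3, dtype="float32").reshape(24, 2, 3)
da = xr.DataArray(data, coords={"time": time}, dims=("time", "j", "i"), name="thetao")

# Chunking makes the array dask-backed, similar to what open_mfdataset returns
lazy = da.chunk({"time": 12})

# Building the annual mean only constructs a task graph; nothing is computed yet
annual = lazy.groupby("time.year").mean(dim="time")
is_lazy = annual.chunks is not None

# Accessing .values runs the graph and returns a plain NumPy array
result = annual.values
print(is_lazy, type(result).__name__, result.shape)
```

If `is_lazy` is true, the groupby result is still a dask array and the work is deferred until `.values` (or `.compute()` / `.load()`) is called.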

