In [2]:
%load_ext load_style
%load_style talk.css

In [3]:
%matplotlib inline

EOF decomposition of SST anomalies in the Pacific with the eofs library

eofs is a Python library developed by Andrew Dawson which provides a very convenient interface for performing EOF decomposition of geophysical (ocean / atmosphere) fields

It has 3 interfaces:

  • standard: expects numpy arrays as inputs
  • cdms: cdms2 objects (cdms2 is a class for opening netcdf files [amongst other formats]) part of the cdat-lite package or UV-CDAT distribution)
  • iris: iris cubes

We are going to use xray here, and thus use the standard interface, passing the underlying numpy arrays

import the eof solver, standard interface

In [4]:
from eofs.standard import Eof

usual imports

In [5]:
import os, sys
import pandas as pd
import numpy as np
from numpy import ma
from matplotlib import pyplot as plt
from mpl_toolkits.basemap import Basemap as bm

In [6]:
dpath = os.path.join(os.environ['HOME'],'data/SST/ER_SST/V4')
#dpath = os.path.join(os.environ['HOME'],'data/SST/ER_SST/')

defines a function to plot a 2D field map

In [7]:
def plot_field(X, lat, lon, vmin, vmax, step, cmap=plt.get_cmap('jet'), ax=False, title=False, grid=False):
    if not ax: 
        f, ax = plt.subplots(figsize=(10, (X.shape[0] / float(X.shape[1])) * 10)) = ax
    im = m.contourf(lons, lats, X, np.arange(vmin, vmax+step, step), latlon=True, cmap=cmap, extend='both', ax=ax)
    if grid: 
        m.drawmeridians(np.arange(0, 360, 60), labels=[0,0,0,1])
        m.drawparallels([-40, 0, 40], labels=[1,0,0,0])
    if title: 

load the SST data using xray

In [8]:
import xray; print(xray.__version__)


In [9]:
ncfname = os.path.join(dpath,'')

In [10]:
dset = xray.open_dataset(ncfname)

In [12]:

Dimensions:   (lat: 89, lon: 180, nv: 2, time: 804, zlev: 1)
  * time      (time) datetime64[ns] 1948-01-15 1948-02-15 1948-03-15 1948-04-15 ...
  * zlev      (zlev) float32 0.0
  * lat       (lat) float32 -88.0 -86.0 -84.0 -82.0 -80.0 -78.0 -76.0 -74.0 -72.0 -70.0 ...
  * lon       (lon) float32 0.0 2.0 4.0 6.0 8.0 10.0 12.0 14.0 16.0 18.0 20.0 22.0 24.0 ...
  * nv        (nv) int64 0 1
Data variables:
    lat_bnds  (lat, nv) float32 -89.0 -87.0 -87.0 -85.0 -85.0 -83.0 -83.0 -81.0 -81.0 ...
    lon_bnds  (lon, nv) float32 -1.0 1.0 1.0 3.0 3.0 5.0 5.0 7.0 7.0 9.0 9.0 11.0 11.0 ...
    sst       (time, zlev, lat, lon) float64 nan nan nan nan nan nan nan nan nan nan nan ...
    anom      (time, zlev, lat, lon) float64 nan nan nan nan nan nan nan nan nan nan nan ...
    Conventions: CF-1.6
    Metadata_Conventions: CF-1.6, Unidata Dataset Discovery v1.0
    metadata_link: C00884
    id: ersst.v4.194801
    naming_authority: gov.noaa.ncdc
    title: NOAA Extended Reconstructed Sea Surface Temperature (ERSST), Version 4 (in situ only)
    summary: ERSST.v4 is developped based on v3b after revisions of 11 parameters using updated data sets and advanced knowledge of ERSST analysis
    institution: NOAA/NESDIS/NCDC
    creator_name: Boyin Huang
    date_created: 2014-10-24
    production_version: Beta Version 4
    history: Fri Feb 13 15:18:09 2015: ncrcat ersst.19...
    publisher_name: Boyin Huang
    license: No constraints on data access or use
    time_coverage_start: 1948-01-15T000000Z
    time_coverage_end: 1948-01-15T000000Z
    geospatial_lon_min: -1.0f
    geospatial_lon_max: 359.0f
    geospatial_lat_min: -89.0f
    geospatial_lat_max: 89.0f
    geospatial_lat_units: degrees_north
    geospatial_lat_resolution: 2.0
    geospatial_lon_units: degrees_east
    geospatial_lon_resolution: 2.0
    spatial_resolution: 2.0 degree grid
    cdm_data_type: Grid
    processing_level: L4
    standard_name_vocabulary: CF Standard Name Table v27
    keywords: Earth Science &gt; Oceans &gt; Ocean Temperature &gt; Sea Surface Temperature &gt
    keywords_vocabulary: NASA Global Change Master Directory (GCMD) Science Keywords
    project: NOAA Extended Reconstructed Sea Surface Temperature (ERSST)
    platform: Ship and Buoy SSTs from ICOADS R2.5 and NCEP GTS
    instrument: Conventional thermometers
    source: ICOADS R2.5 SST, NCEP GTS SST, HadISST ice, NCEP ice
    comment: SSTs were observed by conventional thermometers in Buckets (insulated or un-insulated canvas and wooded buckets) or Engine Room Intaker
    references: Huang et al, 2014: Extended Reconstructed Sea Surface Temperatures Version 4 (ERSST.v4), Part I. Upgrades and Intercomparisons. Journal of Climate, DOI: 10.1175/JCLI-D-14-00006.1.
    climatology: Climatology is based on 1971-2000 SST, Xue, Y., T. M. Smith, and R. W. Reynolds, 2003: Interdecadal changes of 30-yr SST normals during 1871.2000. Journal of Climate, 16, 1601-1612.
    description: In situ data: ICOADS2.5 before 2007 and NCEP in situ data from 2008 to present. Ice data: HadISST ice before 2010 and NCEP ice after 2010.
    nco_openmp_thread_number: 1

selects the period 1980 - 2014 and the tropical Pacific domain

In [13]:
dsub = dset.sel(time=slice('1980','2014'), lat=slice(-40,40), lon=slice(120,290))

get the lat, lon and sst variables as numpy arrays

In [12]:
lat = dsub['lat'].values
lon = dsub['lon'].values
sst = dsub['anom'].values.squeeze() # because of zlev

this is for plotting later

In [13]:
lons, lats = np.meshgrid(lon, lat)

sst.shape = (time, latitudes, longitudes)

In [14]:

(420, 41, 86)

defines an array of weights

In [15]:
coslat = np.cos(np.deg2rad(lat))
wgts = np.sqrt(coslat)[..., np.newaxis]

instantiates the EOF solver, passing the weights array as an optional argument

note how one doesnt need to deal with either the dimensions (stays 3D) or the missing data: the EOF solver deals with all that intelligently !

In [16]:
solver = Eof(sst, weights=wgts)

get the EOF (spatial patterns) as correlations

In [17]:
eof1 = solver.eofsAsCorrelation(neofs=1)

get the Principal Components (PCs), normalised values

In [18]:
pc1 = solver.pcs(npcs=1, pcscaling=1)


In [19]:

(1, 41, 86)

In [20]:
m = bm(projection='cyl',llcrnrlat=-40,urcrnrlat=40,\

In [21]:
plot_field(eof1.squeeze(), lats, lons, -1, 1, 0.1, cmap=plt.get_cmap('RdBu_r'), \
               title="EOF #1", grid=True)

In [22]:
pc1 = pd.Series(pc1.squeeze(), index=dsub['time'].values)

In [23]:

<matplotlib.axes._subplots.AxesSubplot at 0x10c693cd0>