<img src="http://nci.org.au/wp-content/themes/nci/img/img-logo-large.png", width=400>


Data Access and Manipulation Using iPython Notebooks

OFAM and Himawari 8

In this notebook:

The following will go through how to: <br >

  1. Access netCDF data locally from /g/data
  2. Extract/view data
  3. Combine data from different datasets

The following material uses the CSIRO Ocean Forcasting Australian Model (OFAM) and the Bureau of Meteorology Himawari 8 Data Collections. For more information on the collection and licensing, please click here for OFAM and here for Himawari 8.



Prerequisites

A Python 3 environment

Setup instructions for python 3 virtual environments can be found here.

Load the python, ipython, and netcdf4-python modules:

    $ module load python/2.7.11
    $ module load python/2.7.11-matplotlib
    $ module load ipython/4.2.0-py2.7
    $ module load netcdf4-python/1.2.4-ncdf-4.3.3.1-py2.7


Launch the Jupyter Notebook application:

    $ jupyter notebook
NOTE: This will launch the Notebook Dashboard within a new web browser window.


Using virtual environments:

To use along with customised python packages in a virtual environment, begin by following the steps in Python on the VDI: Part II.

Once you have a virtual environment setup with your packages (including Jupyter), proceed by loading the required modules and activating the virtual environment:

    $ module load python/2.7.11
    $ module load proj/4.8.0
    $ module load geos/3.5.0
    $ source <path_to_virtual_environment>/bin/activate


Then, as above, launch the Jupyter Notebook application:

    $ jupyter notebook
NOTE: If you have already followed Python on the VDI: Part II, you should have created a virtual environemtn and installed all the required packages for this example.


Import python libraries


In [1]:
from netCDF4 import Dataset
import matplotlib.pyplot as plt 
import numpy as np
%matplotlib inline

Open dataset


In [2]:
ofam_path = '/g/data3/gb6/BRAN/BRAN_2015/OFAM/ocean_temp_2016_02.nc'

In [3]:
ofam = Dataset(ofam_path)

Take a look at the file contents


In [4]:
for item in ofam.dimensions:
    print ofam.dimensions[item].name, ofam.dimensions[item].size

vars = ofam.variables.keys()
for item in vars:
    print 'Variable: \t', item
    print 'Dimensions: \t', ofam[item].dimensions
    print 'Shape:    \t', ofam[item].shape, '\n'


xt_ocean 3600
yt_ocean 1500
st_ocean 51
Time 29
nv 2
st_edges_ocean 52
Variable: 	xt_ocean
Dimensions: 	(u'xt_ocean',)
Shape:    	(3600,) 

Variable: 	yt_ocean
Dimensions: 	(u'yt_ocean',)
Shape:    	(1500,) 

Variable: 	st_ocean
Dimensions: 	(u'st_ocean',)
Shape:    	(51,) 

Variable: 	Time
Dimensions: 	(u'Time',)
Shape:    	(29,) 

Variable: 	nv
Dimensions: 	(u'nv',)
Shape:    	(2,) 

Variable: 	st_edges_ocean
Dimensions: 	(u'st_edges_ocean',)
Shape:    	(52,) 

Variable: 	average_T1
Dimensions: 	(u'Time',)
Shape:    	(29,) 

Variable: 	average_T2
Dimensions: 	(u'Time',)
Shape:    	(29,) 

Variable: 	average_DT
Dimensions: 	(u'Time',)
Shape:    	(29,) 

Variable: 	Time_bounds
Dimensions: 	(u'Time', u'nv')
Shape:    	(29, 2) 

Variable: 	temp
Dimensions: 	(u'Time', u'st_ocean', u'yt_ocean', u'xt_ocean')
Shape:    	(29, 51, 1500, 3600) 

Extract and plot global data from single time and depth


In [5]:
lat = ofam.variables['yt_ocean'][:]
lon = ofam.variables['xt_ocean'][:]
T = ofam.variables['temp'][22,0,:,:]
time = ofam.variables['Time'][:]

In [6]:
plt.figure(figsize=(12,5))
plt.pcolormesh(lon, lat, T)


Out[6]:
<matplotlib.collections.QuadMesh at 0x441bb50>

Do the same for a smaller subset


In [7]:
# time slice
i = 22

T_s = ofam.variables['temp'][i, 0, 200:800, 1400:2100]
lon_s = lon[1400:2100]
lat_s = lat[200:800]

plt.figure(figsize=(12,12))
plt.pcolormesh(lon_s, lat_s, T_s)

plt.clim(vmin=18, vmax=30)


Open dataset


In [8]:
# Data paths
h8b1_path = '/g/data2/rr5/satellite/obs/himawari8/FLDK/2016/02/23/0400/20160223040000-P1S-ABOM_OBS_B01-PRJ_GEOS141_2000-HIMAWARI8-AHI.nc'
h8b2_path = '/g/data2/rr5/satellite/obs/himawari8/FLDK/2016/02/23/0400/20160223040000-P1S-ABOM_OBS_B02-PRJ_GEOS141_2000-HIMAWARI8-AHI.nc'
h8b3_path = '/g/data2/rr5/satellite/obs/himawari8/FLDK/2016/02/23/0400/20160223040000-P1S-ABOM_OBS_B03-PRJ_GEOS141_2000-HIMAWARI8-AHI.nc'

# Open data
h8b1 = Dataset(h8b1_path)
h8b2 = Dataset(h8b2_path)
h8b3 = Dataset(h8b3_path)

Take a look at the file contents


In [9]:
for item in h8b1.dimensions:
    print h8b1.dimensions[item].name,  h8b1.dimensions[item].size

vars = h8b1.variables.keys()
for item in vars:
    print 'Variable: \t', item
    print 'Dimensions: \t', h8b1[item].dimensions
    print 'Shape:    \t', h8b1[item].shape, '\n'


time 1
y 5500
x 5500
Variable: 	time
Dimensions: 	(u'time',)
Shape:    	(1,) 

Variable: 	y
Dimensions: 	(u'y',)
Shape:    	(5500,) 

Variable: 	x
Dimensions: 	(u'x',)
Shape:    	(5500,) 

Variable: 	geostationary
Dimensions: 	()
Shape:    	() 

Variable: 	scan_line_time
Dimensions: 	(u'y',)
Shape:    	(5500,) 

Variable: 	channel_0001_scaled_radiance
Dimensions: 	(u'time', u'y', u'x')
Shape:    	(1, 5500, 5500) 

Extract and plot global data


In [10]:
b1 = h8b1.variables['channel_0001_scaled_radiance'][0,:,:]
b2 = h8b2.variables['channel_0002_scaled_radiance'][0,:,:]
b3 = h8b3.variables['channel_0003_scaled_radiance'][0,:,:]

x = h8b1.variables['x'][:]
y = h8b1.variables['y'][:]

In [11]:
plt.figure(figsize=(6,6))
plt.imshow(b1, extent=[x[0], x[-1], y[-1], y[0]])


Out[11]:
<matplotlib.image.AxesImage at 0x67dbb90>

Instead of looking at single band, let's make an RGB image


In [12]:
vmin = 0
vmax = .5
B1 = b1.clip(vmin, vmax) / vmax * 255
B2 = b2.clip(vmin, vmax) / vmax * 255
B3 = b3.clip(vmin, vmax) / vmax * 255

rgb = np.stack((B3, B2, B1), axis=2).astype('uint8')

In [13]:
# Plot image
plt.figure(figsize=(12,12))
plt.imshow(rgb, extent=[x[0], x[-1], y[-1], y[0]])

# Add labels
plt.title('Himawari 8 \n', fontsize=20)

# Adjust tick mark size
plt.tick_params(labelsize=16)


Do the same for a smaller subset (let's choose roughly the same region as the OFAM subset)

Note: In these examples, the subsets are specified directly by the index value but you could also query based on lat/lon values.


In [14]:
vmin = 0
vmax = .5
B1 = b1[3300:4500,3000:5000].clip(vmin, vmax) / vmax * 255
B2 = b2[3300:4500,3000:5000].clip(vmin, vmax) / vmax * 255
B3 = b3[3300:4500,3000:5000].clip(vmin, vmax) / vmax * 255

X = x[3000:5000]
Y = y[3300:4500]
rgb = np.stack((B3, B2, B1), axis=2).astype('uint8')


# Plot image
plt.figure(figsize=(12,12))
plt.imshow(rgb , extent=[X[0], X[-1], Y[-1], Y[0]])

# Add labels
plt.title('Himawari 8 \n', fontsize=20)

# Adjust tick mark size
plt.tick_params(labelsize=16)


Now let's try and plot both datasets in the same plot...

Cartopy (based on Matplotlib but includes support for different coordinate reference systems)

Note: On the VDI, the Cartopy package will have to be installed locally. Instructions at the end of this notebook if you do not already have Cartopy installed.

Let's first replot the previous Himawari subset


In [15]:
import cartopy.crs as ccrs

plt.figure(figsize=(12,12))

# Define the projection information of the image
img_proj = ccrs.Geostationary(central_longitude=h8b1['geostationary'].longitude_of_projection_origin, 
                                  satellite_height=h8b1['geostationary'].satellite_height)
# The extents of the image we are plotting
img_extent = (X[0], X[-1], Y[-1], Y[0])


# Setup the axes projection
ax = plt.axes(projection=ccrs.Miller(central_longitude=float(h8b1['geostationary'].longitude_of_projection_origin)))

# Extent of the axes in the plot
ax.set_extent([150, 190, -35, -5])

# Plot image
plt.imshow(rgb, transform=img_proj, extent=img_extent, origin='upper')


Out[15]:
<matplotlib.image.AxesImage at 0x106df9d0>

Now let's do the same for the OFAM data


In [16]:
plt.figure(figsize=(12,12))

# Setup the axes projection
ax = plt.axes(projection=ccrs.Miller(central_longitude=float(h8b1['geostationary'].longitude_of_projection_origin)))

# Define the projection information of the image
img_proj = ccrs.PlateCarree()

# The extents of the image we are plotting
img_extent = (lon_s[0], lon_s[-1], lat_s[-1], lat_s[0])

# Extent of the axes in the plot
ax.set_extent([150, 190, -35, -5])

plt.imshow(T_s, transform=img_proj, extent=img_extent, origin='upper')
plt.clim(vmin=18, vmax=30)


Now combine...


In [17]:
plt.figure(figsize=(12,12))

# Setup the axes projection
ax = plt.axes(projection=ccrs.Miller(central_longitude=float(h8b1['geostationary'].longitude_of_projection_origin)))


### OFAM ###

img_proj = ccrs.PlateCarree()
img_extent = (lon_s[0], lon_s[-1], lat_s[-1], lat_s[0])
ax.set_extent([150, 190, -35, -5])
plt.imshow(T_s, transform=img_proj, extent=img_extent, origin='upper', alpha=.75)
plt.clim(vmin=18, vmax=30)


### HIMAWARI ###
img_proj = ccrs.Geostationary(central_longitude=h8b1['geostationary'].longitude_of_projection_origin, 
                                  satellite_height=h8b1['geostationary'].satellite_height)
img_extent = (X[0], X[-1], Y[-1], Y[0])
ax.set_extent([150, 190, -35, -5])
plt.imshow(rgb, transform=img_proj, extent=img_extent, origin='upper', alpha=.5)


Out[17]:
<matplotlib.image.AxesImage at 0x4406d90>
EXTRA:
(1) Try playing around with different plotting options. For example, plotting contours instead of using "imshow".
(2) Try merging another dataset of interest with either of the ones above.

Close files


In [18]:
ofam.close()
h8b1.close()
h8b2.close()
h8b3.close()




To install Cartopy on the VDI (within a virtual envrionment):

  • First load the required modules (these will need to be loaded before activating the virtual environment each time you want to use Cartopy):

    $ module load python/2.7.11
    $ module load proj/4.8.0
    `$ module load geos/3.5.0`

  • Activate your virtual environment

    $ source <venv>/bin/activate

  • If you do not already have Cython installed, install:
$ pip install cython 


  • Install Cartopy (v0.13.0, later versions need more recent proj4 library that is not yet available on the VDI)

    $ pip install cartopy==0.13.0