In this session you will learn some of the functions available in SciPy package.
General purpose scientific library (that consist of bunch of sublibraries) and builds on NumPy arrays. It consists of several submodules:
scipy.special
)scipy.integrate
)scipy.optimize
)scipy.interpolate
)scipy.fftpack
)scipy.signal
)scipy.linalg
)scipy.sparse
)scipy.stats
)scipy.ndimage
)scipy.io
)See full documentation.
In the first exercise you are blending numpy, scipy and matplotlib together.
In [1]:
# import os
# import numpy as np
# import matplotlib.pyplot as plt
# from scipy import interpolate # import submodule for interpolation and regridding
## make figures appear within the notebook
# %matplotlib inline
Define a file name:
In [2]:
## code is ready here, just uncomment
#
# file_name = os.path.join(os.path.pardir, 'data',
# 'surface_synoptic_20161109_0000_lon_lat_temperature_17097.bin')
Now use numpy.fromfile()
function to read data.
Side note: If it was an unformatted sequential binary file from Fortran code, you could have used scipy.io.FortranFile
.
Important:
In [3]:
# nrec = 17097
In [4]:
# bin_data = np.fromfile( ..., dtype=...)
Aside
Just as in Fortran, you can use direct access way of reading unformatted binary files with a given length of records. For example:
nx = 123 # points in longitude
ny = 456 # points in latitude
with open(my_file_name, 'rb') as f:
f.seek(4 * ny * nx) # Move nx*ny float32 records forward (skip the first slice)
arr = np.fromfile(f, dtype=np.float32, count=ny*nx) # Read the next slice of data
bin_data
. Is this what you expected?
In [5]:
# lons = bin_data[:nrec]
# lats =
# temp =
You can now quickly plot the data, for example using matplotlib.pyplot.scatter
function. Otherwise, skip the next cell.
In [6]:
# plt.scatter(lons, lats, c=temp)
The next step is to grid the observations onto a rectangular grid. Due to data sparsity over some regions, you can create a smaller sample first, e.g. just over Europe. You can loosely define the Europe region within 10W and 60E longitude and within 35N and 70N latitude. Then you can create a "condition" to subset the original data.
In [7]:
# sub_idx = (lons > ...) & (lons < 60) & (lats > ...) & (lats < ...) # indices over "Europe"
Check that subset works.
In [8]:
# print('Original shape: {}'.format(lons.shape))
# print('Subset shape: {}'.format(lons[sub_idx].shape))
In [9]:
# lon_step =
# lat_step =
# lon1d = np.arange(-10, 60.1, lon_step)
# lat1d = np.arange(35, 70.1, lat_step)
# lon2d, lat2d = np.meshgrid(lon1d, lat1d)
scipy.interpolate.griddata
function.sub_idx
! (Hint: it should go to xpoints, ypoints and values)scipy.interpolate
submodule for other interpolation options.
In [10]:
# temp_grd_lin = interpolate.griddata((xpoints, ypoints), values, xi, method='linear')[source]
In [11]:
# temp_grd_near = interpolate.griddata( ... )
# temp_grd_cub = interpolate.griddata( ... )
Enough boring interpolation, time to plot what you got so far.
scatter()
functioncontourf()
or pcolormesh()
to plot gridded datacolorbar
to each subplot, if you like
In [12]:
# fig, axs = plt.subplots(figsize=(15, 10), nrows=2, ncols=2) #
# print('2x2 subplot grid: {}'.format(axs.shape))
#
# h = axs[0, 0].scatter(lons[sub_idx], lats[sub_idx], c=temp[sub_idx],
# edgecolors='none', vmin=..., vmax=...)
# fig.colorbar(h, ax=axs[0, 0])
# axs[0, 0].set_title('Observations')
#
#
#
# h = axs[0, 1].pcolormesh(lon2d, lat2d, temp_grd_lin,
# vmin=..., vmax=...)
# fig.colorbar(h, ax=axs[0, 1])
# axs[0, 0].set_title('Linear')
#
#
#
# h = axs[1, 0] ... ( ... temp_grd_near)
#
#
# axs[1, 0].set_title('Nearest')
#
#
#
# axs[1, 1].set_title('Cubic')