note : The data are available at: https://drive.google.com/open?id=0B9LDS6UoXaUoUTJ2YVdFeWpXVlk
In [1]:
import numpy as np
from scipy.io import netcdf
from matplotlib import pyplot as plt
%matplotlib inline
In [4]:
f = netcdf.netcdf_file('./data/r2_reanalysis.nc', 'r')
The file can be iterated to show all of the variables
In [5]:
for var in f.variables:
print var
Now we can extract the variables that we want to use. Using the [:]
indicates that a numpy array we be extract, otherwise an python object will be extracted
In [6]:
u = f.variables['u']
v = f.variables['v']
t = f.variables['t']
longitude = f.variables['lon'][:]
latitude = f.variables['lat'][:]
level = f.variables['level'][:]
time = f.variables['time']
print u.shape, longitude.shape, latitude.shape, level.shape, time.shape
print time.units
In [7]:
print latitude
print longitude
print level
netcdf uses a type of compression for the variables that requires to be multiplied by an scale factor (scale_factor
) and added an offset value (add_offset
). For example, the scale_factor
for the zonal wind U is 0.01 and it's offset value is 187.65. If we don't apply this changes the first element of the u
array is u[0,0,0,0] = -18573
and if we apply those changes it becomes u[0,0,0,0] = 1.92
. The module netCDF4 (http://unidata.github.io/netcdf4-python/) automagically handles those attributes saving up some time.
In [8]:
def unpack(var):
return var[:] * var.scale_factor + var.add_offset
In [9]:
u = unpack(u)
v = unpack(v)
t = unpack(t)
The np.meshgrid
returns a coordinate matrice from coordinate vector. In this example it will associate each of the longitudes in longitude
vector to each of the latitudes in latitude
and vice versa. Thereby, an 2D array will be created
In [10]:
lons, lats = np.meshgrid(longitude, latitude)
print lons.shape, lats.shape
To plot that data we can use Basemap http://matplotlib.org/basemap/users/index.html
" The matplotlib basemap toolkit is a library for plotting 2D data on maps in Python (...) Basemap is geared toward the needs of earth scientists, particular oceanographers and meteorologists (...) "
In [11]:
from mpl_toolkits.basemap import Basemap
A simple image can be plotted using the min and max values in the lats and lons variables. We can create a map object m defining the lower corner and uper corner longitude and latitude, llcrnrlon(lat) and urcrnrlon(lat). Then we can use the matplotlib methods to make the temperature and wind plot.
In [12]:
fig = plt.figure(figsize=(16,35))
m = Basemap(llcrnrlon=lons.min(), llcrnrlat=lats.min(), urcrnrlon=lons.max(), urcrnrlat=lats.max())
m.drawcoastlines()
m.drawcountries()
skip = 2
cs = m.contourf(lons, lats, t[0, 0, :, :])
qv = m.quiver(lons[::skip, ::skip], lats[::skip, ::skip], u[0, 0, ::skip, ::skip], v[0, 0, ::skip, ::skip])
There are many different projections to choose from in Basemap:
In [13]:
fig = plt.figure(figsize=(20,20))
ax1 = fig.add_subplot(121)
ax1.set_title('Orthographic Projection')
m = Basemap(projection='ortho',lat_0=-15,lon_0=300, ax=ax1)
m.drawcoastlines()
m.drawcountries()
skip = 2
x, y = m(lons, lats)
cs = m.contourf(x, y, t[0, 0, :, :])
qv = m.quiver(x[::skip, ::skip], y[::skip, ::skip], u[0, 0, ::skip, ::skip], v[0, 0, ::skip, ::skip])
###############################################################################################
ax2 = fig.add_subplot(122)
ax2.set_title('Polar Azimuthal Equidistant Projection')
m = Basemap(projection='spaeqd',boundinglat=-10,lon_0=270, ax=ax2)
m.drawcoastlines()
m.drawcountries()
skip = 2
x, y = m(lons, lats)
cs = m.contourf(x, y, t[0, 0, :, :])
qv = m.quiver(x[::skip, ::skip], y[::skip, ::skip], u[0, 0, ::skip, ::skip], v[0, 0, ::skip, ::skip])
Answer: Since my longitude and my variables are ordered from 0 to 360, only a part of the data would be plotted (the part between 0 and 180). The part between -180 and 0 which equals 180 to 360, in the data between 0 and 360 would not be plotted
In [14]:
fig = plt.figure(figsize=(16,35))
m = Basemap(llcrnrlon=-180, llcrnrlat=lats.min(), urcrnrlon=180, urcrnrlat=lats.max())
m.drawcoastlines()
m.drawcountries()
skip = 2
cs = m.contourf(lons, lats, t[0, 0, :, :])
qv = m.quiver(lons[::skip, ::skip], lats[::skip, ::skip], u[0, 0, ::skip, ::skip], v[0, 0, ::skip, ::skip])
To reorder the variables and the longitude we can create a flip_grid
function, that takes the longitude and variable to be reordered as arguments. This function will define a filter fltr
that will be True for every longitude higher than 180 and False for values below. The filter will then be used to concatenate the values between -180 to -2.5 with the values between 0 to 177.5 for my longitude vector and my variable
In [15]:
def flip_grid(var, lons):
fltr = lons >= 180
# fltr = [False False False ... True True True]
newlons = np.concatenate(((lons - 360)[fltr], lons[~fltr]), axis=-1)
# newlons = [-180 -177.5 -175 ... -5 -2.5 ] concatenated with [0 2.5 5 ... 175 177.5]
# newlons = [-180 -177.5 -175 ... 175 177.5 180]
if var.ndim == 2:
newvar = np.concatenate((var[:, fltr], var[:, ~fltr]), axis=-1)
elif var.ndim == 3:
newvar = np.concatenate((var[:, :, fltr], var[:, :, ~fltr]), axis=-1)
elif var.ndim == 4:
newvar = np.concatenate((var[:, :, :, fltr], var[:, :, :, ~fltr]), axis=-1)
return newvar, newlons
The flip_grid
is then used to reorder u, v and t variables
In [16]:
u, newlon = flip_grid(u, longitude)
v, newlon = flip_grid(v, longitude)
t, newlon = flip_grid(t, longitude)
print newlon
In [17]:
lons, lats = np.meshgrid(newlon, latitude)
In [18]:
fig = plt.figure(figsize=(16,35))
m = Basemap(llcrnrlon=lons.min(), llcrnrlat=lats.min(), urcrnrlon=lons.max(), urcrnrlat=lats.max())
m.drawcoastlines()
m.drawcountries()
skip = 2
cs = m.contourf(lons, lats, t[0, 0, :, :])
qv = m.quiver(lons[::skip, ::skip], lats[::skip, ::skip], u[0, 0, ::skip, ::skip], v[0, 0, ::skip, ::skip])
Next, a function find_nearest
is created to find the nearest grid point for a given location, based on the distance between points
In [19]:
def find_nearest(x, y, gridx, gridy):
distance = (gridx - x)**2 + (gridy - y)**2
idx = np.where(distance == distance.min())
# idx = (array([45]), array([0]))
return [idx[0][0], idx[1][0]]
Using, for example, the latitude and longitude of São Paulo (lat = -23.650000 and lon = -46.616667)
In [20]:
idx = find_nearest(-46.616667, -23.650000, lons, lats)
print lons[idx[0], idx[1]], lats[idx[0], idx[1]]
Selecting the index idx[1]
, which means the longitude we can make a plot for u[0, :, :, idx[1]]
which is the zonal wind for all levels, all latitudes in the nearest longitude to -46.616667
In [21]:
fig = plt.figure(figsize=(20,8))
ax = fig.add_subplot(111)
newlats, levs = np.meshgrid(lats[:, idx[1]], level)
clevs = np.linspace(-50, 50, 8)
cs = ax.contourf(newlats, levs, u[0, :, :, idx[1]], clevs, cmap='PiYG', extend='both')
cbar = plt.colorbar(cs)
ax.invert_yaxis()
The idea is to extract a time series for the variables near the surface. This means a dataframe of this kind:
2016-01-01 300.700012
2016-01-02 299.500000
2016-01-03 299.279999
2016-01-04 298.349976
...
First we need to take a look in the time variable...
In [22]:
print time[:]
Dates in netcdf files are generally computed as hours or days since an especific date. The ncdump -h
for file r2.nc:
double time(time) ;
time:standard_name = "time" ;
time:long_name = "Time" ;
time:bounds = "time_bnds" ;
time:units = "hours since 1800-1-1 00:00:00" ;
time:calendar = "standard" ;
time:axis = "T" ;
In [23]:
print time.units
We may now import datetime
and timedelta
to make a list with datetime objects instead of those float. In the netCDF4 this convertion is as easy as dates = netCDF4.num2date(time[:], time.units)
In [24]:
from datetime import datetime, timedelta
In [25]:
def get_dates(time):
splited_date = time.units.split()
# splited_date = ['hours', 'since', '1800-1-1', '00:00:00']
start_date_string = ' '.join(splited_date[-2:])
# start_date_string = 1800-1-1 00:00:00
# convert string into datetime object
start_date = datetime.strptime(start_date_string, '%Y-%m-%d %H:%M:%S')
dates = [start_date + timedelta(hours=i) for i in time[:]]
return dates
In [26]:
dates = get_dates(time)
print dates
We read now the variables near the surface (t2, u10, v10, prate)
In [27]:
u10 = f.variables['u10']
v10 = f.variables['v10']
prate = f.variables['prate']
tmax = f.variables['tmax']
tmin = f.variables['tmin']
print u10.shape, prate.units
The values of the variables are then unpacked and the grid is fliped from 0 to 360 to -180 to 180
In [28]:
u10 = unpack(u10)
v10 = unpack(v10)
prate = unpack(prate)
tmax = unpack(tmax)
tmin = unpack(tmin)
u10, newlon = flip_grid(u10, longitude)
v10, newlon = flip_grid(v10, longitude)
prate, newlon = flip_grid(prate, longitude)
tmax, newlon = flip_grid(tmax, longitude)
tmin, newlon = flip_grid(tmin, longitude)
In [29]:
lons, lats = np.meshgrid(newlon, latitude)
We find then grid point of the latitude and longitude that we want
In [30]:
idx = find_nearest(-46.248889, -22.880833, lons, lats)
Import pandas to create a dataframe with dates as index and then save it to a csv file
In [31]:
import pandas as pd
tx2C = tmax - 273.15 # K to oC
tm2C = tmin - 273.15 # K to oC
acum = prate * 3600. * 24. # mm/s mm/day
data = {'u10': u10[:, 0, idx[0], idx[1]], 'v10': v10[:, 0, idx[0], idx[1]], 'prec': acum[:, idx[0], idx[1]], \
'tmax': tx2C[:, 0, idx[0], idx[1]], 'tmin': tm2C[:, 0, idx[0], idx[1]]}
df = pd.DataFrame(data, index=dates)
print df
In [37]:
df.to_csv('./data/modeldata.csv', sep=';')