HRES: the minimum or maximum values are in the hour ending at the forecast step
, https://confluence.ecmwf.int/display/CKB/ERA5+data+documentation#ERA5datadocumentation-Minimum/maximumsincethepreviouspostprocessingHowever, the ECMWF interpolation software does not conserve area integrals, ...
, which makes one wonder if data interpolated to 0.25 degree grid actually is area conservative or not? Also, modern grib reading software (pygrib, java CDM) do not have problem in reading the native grid grib files and returning the correct 0.28125 resolution, so interpolation of data before giving to users, seems unnecessary.First, we analyse 2m temperature forecast vs minimum 2m temperature forecast, showing the time step difference and problem with new analysis cycle. Then, we show that forecasted mn2t can be higher than analysed 2t over large areas, questioning how they can be used together.
We get sample data via ECMWF API (soon to be closed).
{'type': 'fc', 'levtype': 'sfc', 'step': '1/2/3/4/5/6/7/8/9/10/11', 'param': '167.128', 'dataset': 'era5', 'stream': 'oper', 'target': '/data.e2/ERA5_analysis/era5_167.128_2011_1_fc.grb2', 'expver': '1', 'class': 'ea', 'date': '2011-01-01/to/2011-01-31', 'time': '06:00:00/18:00:00'}
s3://era5-pds/QA/
In [1]:
%matplotlib notebook
import os
import boto3
import pygrib
import numpy as np
import matplotlib.pyplot as plt
import datetime
import pandas as pd
In [4]:
## lets analyse grib directly for January 2011.
an_2t = 'era5_167.128_2011_1_an_v1.grb2'
fc_2t = 'era5_167.128_2011_1_fc_v2.grb2'
mn2t_fc = 'era5_202.128_2011_1_fc_v2.grb2'
for ff in [an_2t, fc_2t, mn2t_fc]:
if not os.path.exists(ff):
session = boto3.Session()
client = session.client('s3')
client.download_file('era5-pds','QA/' + ff, ff)
fmin = pygrib.open(mn2t_fc)
ft2m = pygrib.open(fc_2t)
ft2m_an = pygrib.open(an_2t)
In [5]:
window = 2 # total steps is window * 2 + 1
max_count = ft2m_an.messages - window -7
max_count = 12
def get_grib_time_step(in_fld):
"""
convert grib message time to timestamp, using step. Attention when using with aggregated fields like mn2t
"""
in_date = datetime.datetime(int(str(in_fld.date)[0:4]), int(str(in_fld.date)[4:6]), int(str(in_fld.date)[6:8]))
in_time = datetime.timedelta(hours=in_fld.time/100)
in_step = datetime.timedelta(hours=in_fld.step)
return in_date + in_time + in_step
def get_diff(pre, suf, window, pre_suf_diff):
return {tstep_diff: [
(np.mean(pre[tstep_an].data()[0] - suf[tstep_an-pre_suf_diff+tstep_diff].data()[0]),
np.amin(pre[tstep_an].data()[0] - suf[tstep_an-pre_suf_diff+tstep_diff].data()[0]),
np.sum(np.where((pre[tstep_an].data()[0] - suf[tstep_an-pre_suf_diff+tstep_diff].data()[0])<-1,1,0)),
tstep_an, pre[tstep_an].step, suf[tstep_an-pre_suf_diff+tstep_diff].step,
get_grib_time_step(pre[tstep_an]), get_grib_time_step(suf[tstep_an-pre_suf_diff+tstep_diff]))
for tstep_an in range(pre_suf_diff+1+window, max_count+pre_suf_diff+1+window)]
for tstep_diff in range(-1*window, window+1)
}
## an2t_mn2t_diff -- difference between analysed 2t and forecasted mn2t. Note that analysis starts 7 hours before forecast.
## an2t_fc2t_diff -- difference between analysed 2t and forecasted 2t
## fc2t_mn2t_diff -- difference between forecasted 2t and forecasted mn2t
an2t_mn2t_diff = get_diff(ft2m_an, fmin, window, 7)
#an2t_fc2t_diff = get_diff(ft2m_an, ft2m, window, 0)
fc2t_mn2t_diff = get_diff(ft2m, fmin, window, 0)
In [6]:
def get_sm(infld, diff_add=0):
rd = {'Step diff':[], "min":[], "mean":[], "median min":[],
"diff > -2 mean":[],
"diff > -2 max":[],
"diff > -2 min":[]}
for k,v in infld.items():
atemp = np.array(v)
date_diff1 = np.diff(atemp[:,6])
date_diff2 = np.diff(atemp[:,7])
date_diff3 = list(map(datetime.timedelta.total_seconds, atemp[:,7] - atemp[:,6]))
assert np.amin(date_diff1) == np.amax(date_diff1)
assert np.amin(date_diff2) == np.amax(date_diff2)
assert np.isclose(float(np.mean(date_diff3)), float(k * 3600)), "{0} {1} {2} {3}".format(k, diff_add, np.mean(date_diff3), (k+diff_add)*3600)
rd['Step diff'].append(k)
rd['min'].append(np.amin(atemp[:,1]))
rd['mean'].append(np.mean(atemp[:,0]))
rd['median min'].append(np.median(atemp[:,1]))
rd['diff > -2 mean'].append(np.mean(atemp[:,2]))
rd['diff > -2 max'].append(np.amax(atemp[:,2]))
rd['diff > -2 min'].append(np.amin(atemp[:,2]))
return pd.DataFrame(rd).set_index('Step diff')
In [7]:
get_sm(fc2t_mn2t_diff)
Out[7]:
Step diff == 1 shows a case where minimum value of 2t-mn2t is mostly zero, except for the step 12 of 2t and 1 of mn2t. This can be illustrated by setting step_diff = 1 in the next graph.
In [8]:
step_diff = 1
atemp=np.array(fc2t_mn2t_diff[step_diff])
fig, ax_1 = plt.subplots()
ax_2 = ax_1.twinx()
ax_1.plot(atemp[:,3],atemp[:,4], label='2t step')
ax_1.plot(atemp[:,3],atemp[:,5], label='mn2t step')
ax_1.set_ylabel("step")
ax_2.plot(atemp[:,3],atemp[:,1], '-*', c='r', label='min of 2t - mn2t')
ax_2.set_ylabel("2t - mn2t")
ax_2.set_ylim([-10,1])
ax_1.grid()
ax_1.legend()
ax_2.legend()
plt.show()
In [9]:
get_sm(an2t_mn2t_diff)
Out[9]:
Apparenlty, for any timestep, there is a large number of grid points, where minimum temperature is more than 2 degrees higher than corresponding hourly temperature. Looking at single timesteps:
In [10]:
atemp=np.array(an2t_mn2t_diff[1])
fig, ax_1 = plt.subplots(figsize=(7,7))
ax_2 = ax_1.twinx()
ax_2.set_ylabel("nr of net p's")
ax_1.plot(atemp[:,3],atemp[:,1])
ax_2.plot(atemp[:,3], atemp[:,2], c='r', label="nr of neg p's")
ax_2.legend()
plt.title("Min 2t-mn2t, number of gridpoints (2t-mn2t)<-2")
plt.show()
To have a more visual view, let's look at full data in one timestep
In [11]:
tstep = 10
tstep_min = tstep + 1
anstep = tstep + 7
assert get_grib_time_step(ft2m_an[anstep])==get_grib_time_step(fmin[tstep]), "{0} {1}".format(get_grib_time_step(ft2m_an[anstep]), get_grib_time_step(fmin[tstep]))
fig=plt.figure()
td = np.flipud(ft2m_an[anstep].data()[0] - fmin[tstep_min].data()[0])
plt.pcolormesh(np.ma.masked_where(td > -0.1,td), cmap=plt.cm.Blues)
plt.colorbar()
plt.title(fmin[tstep+1].step)
plt.show()
Similar for 2t forecast vs mn2t
In [12]:
tstep = 10
tstep_min = tstep + 1
anstep = tstep + 7
assert get_grib_time_step(ft2m_an[anstep])==get_grib_time_step(ft2m[tstep]), "{0} {1}".format(get_grib_time_step(ft2m_an[anstep]), get_grib_time_step(ft2m[tstep]))
fig=plt.figure()
td = np.flipud(ft2m_an[anstep].data()[0] - ft2m[tstep_min].data()[0])
plt.pcolormesh(np.ma.masked_where(td > -0.1,td), cmap=plt.cm.Blues)
plt.colorbar()
plt.title(fmin[tstep].step)
plt.show()
In [ ]: