ERA5 tutorial

ERA5 contains historical weather data, which can be used to analyse very wide range of problems. In this tutorial we briefly demonstrate how to:

Get timerange data for single location and single variable
Analyse possible wind farm production potential for three different locations

To read more about the ERA5 dataset, please follow these links:

To learn more about Planet OS/Intertrust datahub, please refer to documentation in https://data.planetos.com/datasets



In [128]:

    
# Initialize notebook environment.
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

# Import Planet OS API
from API_client.python.datahub import datahub_main
from API_client.python.lib.dataset import dataset
from API_client.python import package_api

Initialize dataset, print available variable names. In order to be able to access the data, save your API key in the file APIKEY and put it into the same folder where you run this notebook.



In [129]:

    
apikey = open('APIKEY').readlines()[0].strip()
dh = datahub_main(apikey)
ds = dataset('ecmwf_era5',dh,debug=True)



In [4]:

    
ds.variable_names()









    Out[4]:





['100_metre_U_wind_component_surface',
 '100_metre_V_wind_component_surface',
 '10_metre_U_wind_component_surface',
 '10_metre_V_wind_component_surface',
 '2_metre_dewpoint_temperature_surface',
 '2_metre_temperature_surface',
 'Maximum_temperature_at_2_metres_surface_1hour_Maximum',
 'Mean_sea_level_pressure_surface',
 'Minimum_temperature_at_2_metres_surface_1hour_Minimum',
 'Snow_density_surface',
 'Snow_depth_surface',
 'Surface_pressure_surface',
 'Surface_solar_radiation_downwards_surface_1hour_Accumulation',
 'Total_precipitation_surface_1hour_Accumulation',
 'lat',
 'lon',
 'time0',
 'time1']



In [5]:

    
# Choose location, variable
lon = 26
lat = 58
variable = '2_metre_temperature_surface'



In [22]:

    
d1 = ds.get_json_data_in_pandas(count=100000,**{'vars':'2_metre_temperature_surface','lon':lon,'lat':lat})



In [132]:

    
fig=plt.figure()
plt.plot(d1['time'],d1['2_metre_temperature_surface']-273.15)
plt.ylabel('Temperature $C \degree$')
plt.grid()
fig.autofmt_xdate()
plt.show()

Wind energy example

As a more advanced use case, let's try to analyse potential wind energy production in three different locations. For this, we take wind speed at 100 m height, apply a simplified energy production curve to hourly data, and get a statistics for hourly and weekly average (wa) data.



In [133]:

    
def production_curve(wind_speed):
    """ Production curve of a turbine/farm can be simplified as a 
    cube of wind speed, with production 
         starting from a particular wind speed        'v_cut_in'
         getting maximum output power at wind speed   'v_rated'
         production halted at very strong wind speeds 'v_cut_off'
    Note that this simplified approach is useful for comparing places only.
    """
    v_cut_in = 5
    v_rated = 15
    v_cut_off = 25
    max_production = 15**3
    rt = np.zeros_like(wind_speed)
    rt = np.where(wind_speed>v_cut_in, wind_speed**3,wind_speed)
    rt = np.where(wind_speed>v_rated,max_production,rt)
    rt = np.where(wind_speed>v_cut_off,0,rt)
    return rt
    
def wind_production_smooth(wspd):
    """Weekly average production as a moving average"""
    def moving_average(a, n=3) :
        ret = np.cumsum(a, dtype=float)
        ret[n:] = ret[n:] - ret[:-n]
        return ret[n - 1:] / n
    return moving_average(np.array(wspd), n=7*24)

def station_statistics(lonlats, count = 100):
    """
    Compute min, max, mean, 5'th and 95'th percentiles for both 
    hourly values and weekly averages
    """
    ddwd = [(name,ds.get_json_data_in_pandas(count=count, 
                **{'vars':'100_metre_U_wind_component_surface,100_metre_V_wind_component_surface',
                'lon':lon,
                'lat':lat})) for name,lon,lat in lonlats]
    wspds = [(name,np.sqrt(dd['100_metre_U_wind_component_surface']**2 + dd['100_metre_V_wind_component_surface']**2))
              for name,dd in ddwd]
    pcurves = [(name,production_curve(dd)) for name, dd in wspds]
    weekly_ave = [wind_production_smooth(dd) for name,dd in pcurves]    
    retdic = {}
    for dd, wa in zip(pcurves,weekly_ave):
        retdic[dd[0]] = (np.amin(dd[1]),np.amax(dd[1]),np.mean(dd[1]),np.percentile(dd[1],5),np.percentile(dd[1],95),
                         np.amin(wa), np.amax(wa), np.mean(wa), np.percentile(wa,5), np.percentile(wa,95),
                        np.sum(dd[1]))                    
        
    return pd.DataFrame(retdic, index=['min','max','mean','5 percentile', '95 percentile',
                                       'wa min','wa max','wa mean','wa 5 percentile', 'wa 95 percentile', 'sum'])

## Power curve demo
x = np.arange(0,30, 0.2)
fig = plt.figure()
plt.plot(x,production_curve(x))
plt.xlabel('Wind speed m/s')
plt.ylabel('Output power')
plt.title("Idealized turbine production curve")
plt.show()



In [134]:

    
stations = [('Hiiumaa',22.1, 59),            ## planned off-shore wind farm in North-West Estonia
            ('Tõravere',26+28/60, 58+16/60), ## Inland climate station in South-East Estonia
            ('GYM',-3 -35/60, 53+27/60)]     ## existing off-shore wind farm west to Scotland
abc = station_statistics(stations,count=10000)



In [135]:

    
abc.transpose()









    Out[135]:







  
    
      
      min
      max
      mean
      5 percentile
      95 percentile
      wa min
      wa max
      wa mean
      wa 5 percentile
      wa 95 percentile
      sum
    
  
  
    
      GYM
      0.000000
      3375.0
      781.445151
      2.359436
      3375.000000
      61.956120
      2398.221162
      780.378387
      202.650405
      1737.967417
      7.814452e+06
    
    
      Hiiumaa
      0.050631
      3375.0
      770.282982
      2.309145
      2944.666319
      71.048125
      2062.972979
      777.684667
      231.585537
      1564.945427
      7.702830e+06
    
    
      Tõravere
      0.117344
      3375.0
      403.100573
      1.921589
      1532.588197
      22.684435
      1380.830043
      405.582212
      65.306773
      835.503148
      4.031006e+06



In [ ]:

	min	max	mean	5 percentile	95 percentile	wa min	wa max	wa mean	wa 5 percentile	wa 95 percentile	sum
GYM	0.000000	3375.0	781.445151	2.359436	3375.000000	61.956120	2398.221162	780.378387	202.650405	1737.967417	7.814452e+06
Hiiumaa	0.050631	3375.0	770.282982	2.309145	2944.666319	71.048125	2062.972979	777.684667	231.585537	1564.945427	7.702830e+06
Tõravere	0.117344	3375.0	403.100573	1.921589	1532.588197	22.684435	1380.830043	405.582212	65.306773	835.503148	4.031006e+06