EMOS Network

The goal of this notebook is to build and test a network implementation of EMOS, first in pure Theano and then in Keras. As a first step we will try to replicate the results of the standard global EMOS.

The reference score is always the mean CRPS over all dates and stations in 2016.


In [27]:
# Imports
# Note that for the theano model to work, the keras theano backend must be used!
from importlib import reload
import emos_network_theano; reload(emos_network_theano)
from emos_network_theano import EMOS_Network
from losses import crps_cost_function
import utils; reload(utils)
from utils import *
import keras_models; reload(keras_models)
from keras_models import *


Anaconda environment: py36_keras
Darwin 17.2.0

In [2]:
# Basic setup
DATA_DIR = '/Volumes/STICK/data/ppnn_data/'  # Mac
# DATA_DIR = '/project/meteo/w2w/C7/ppnn_data/'   # LMU
results_dir = '../results/'   # Where to save post-processed predictions
window_size = 25   # Days in rolling window
fclt = 48   # Forecast lead time in hours

Prepare the data

The interpolated raw data set contains the observations and the forecasts from the 50 ensemble members for each station and each day. Using the get_train_test_sets function we convert this raw data set to a format suitable for the networks.

Here we pick a forecast date and use as training data all days within the rolling window preceding the start of the forecast. The test data is simply the data for the chosen forecast date. The 50-member ensemble is summarized by its mean and standard deviation. Additionally, we remove all samples where the observation is missing. Finally, the inputs are scaled: each feature is divided by its maximum in the training set.
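A minimal sketch of the scaling step (the actual logic lives inside get_train_test_sets; the array names here are hypothetical):

import numpy as np

# train_features / test_features: hypothetical arrays of shape (n_samples, n_features)
max_per_feature = train_features.max(axis=0)               # maxima from the training set only
train_features_scaled = train_features / max_per_feature
test_features_scaled = test_features / max_per_feature     # reuse the training maxima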


In [3]:
# Choose a random date to illustrate the algorithm
date_str = '2011-02-14'   # This is our standard date format

In [4]:
# Load training and test set
train_set, test_set = get_train_test_sets(DATA_DIR, predict_date=date_str,
                                          fclt=fclt, window_size=window_size)


train set contains 25 days
test set contains 1 days

These two sets are objects which contain all the data and some meta information.


In [5]:
train_set.feature_names


Out[5]:
['t2m_fc_mean', 't2m_fc_std']

In [6]:
train_set.features.shape, train_set.targets.shape, train_set.date_strs.shape


Out[6]:
((12619, 2), (12619,), (12619,))

In [7]:
test_set.features.shape, test_set.targets.shape, test_set.date_strs.shape


Out[7]:
((503, 2), (503,), (503,))

Theano Implementation

To start with we build the model in pure Theano. The model is defined in a separate script, emos_network_theano.py. The network uses a custom CRPS loss function which is defined in losses.py.
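For reference, the closed-form CRPS of a Gaussian forecast (Gneiting et al. 2005) is what crps_cost_function presumably implements as a tensor expression; a plain NumPy/SciPy sketch, not the actual loss code:

import numpy as np
from scipy.stats import norm

def crps_gaussian(mu, sigma, y):
    # CRPS of a normal distribution N(mu, sigma^2) evaluated at observation y
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))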

The EMOS_Network class is built to work in a similar way to Keras models. For the fitting we use gradient descent. Since we are using the entire dataset for each update, it is not stochastic. An early stopping algorithm is built into the fitting function: it stops training once the average training CRPS over the last 5 steps has improved by less than a parameter delta.
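A rough sketch of this stopping rule (the exact implementation inside EMOS_Network.fit may differ):

def should_stop(train_crps_history, delta, n=5):
    # Stop once the mean training CRPS of the last n steps improves by less than delta
    # compared to the n steps before that.
    if len(train_crps_history) < 2 * n:
        return False
    previous = np.mean(train_crps_history[-2 * n:-n])
    recent = np.mean(train_crps_history[-n:])
    return (previous - recent) < delta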

Train for a single day

To illustrate how the model works we will use the data for our example day above and fit the model.


In [8]:
# Define some model parameters
lr = np.asarray(0.1, dtype='float32')   # The learning rate
early_stopping_delta = 1e-4   # Minimum CRPS improvement required to continue training
steps_max = 1000   # How many steps to fit at max

In [9]:
# Set up the theano model
model_theano = EMOS_Network()

In [10]:
# Split the features into means and standard deviation
train_set.feature_names


Out[10]:
['t2m_fc_mean', 't2m_fc_std']

In [11]:
train_mean = train_set.features[:, 0]
train_std = train_set.features[:, 1]
test_mean = test_set.features[:, 0]
test_std = test_set.features[:, 1]

In [12]:
# Train the model for some steps
model_theano.fit(train_mean, train_std, train_set.targets, steps_max, 
                 (test_mean, test_std, test_set.targets), lr=lr, 
                 early_stopping_delta=early_stopping_delta)
# Output is the training CRPS and the test CRPS


Out[12]:
(array(1.1364955434365136), array(0.7820005313801467))

Post processing for all of 2016 with a rolling window

To compare the network model with the standard EMOS we run it from 1 January 2016 to 31 December 2016. When looping over the days we do not reset the model weights for each day; this drastically reduces the training time while giving identical results.
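Conceptually, loop_over_days does something along these lines (a simplified outline with assumed details, not the actual implementation):

import pandas as pd

daily_test_crps = []
for date in pd.date_range(date_str_start, date_str_stop, closed='left'):
    day_str = date.strftime('%Y-%m-%d')
    train_set, test_set = get_train_test_sets(DATA_DIR, predict_date=day_str,
                                              fclt=fclt, window_size=window_size)
    # The model is NOT re-initialized here, so yesterday's weights are the starting
    # point and only a few gradient steps are needed per day.
    train_crps, test_crps = model_theano.fit(
        train_set.features[:, 0], train_set.features[:, 1], train_set.targets,
        steps_max, (test_set.features[:, 0], test_set.features[:, 1], test_set.targets),
        lr=lr, early_stopping_delta=early_stopping_delta)
    daily_test_crps.append(test_crps)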


In [13]:
# Get start and stop indices
date_str_start = '2016-01-01'
date_str_stop = '2017-01-01'

In [14]:
model_theano = EMOS_Network()

In [17]:
# This function loops over the days.
train_crps_list, valid_crps_list, results_df = loop_over_days(
    DATA_DIR,
    model_theano,
    date_str_start, date_str_stop, 
    window_size=window_size,
    fclt=fclt,     
    epochs_max=steps_max, 
    early_stopping_delta=early_stopping_delta, 
    lr=lr,
    verbose=0,
    model_type='EMOS_Network_theano')


100%|██████████| 366/366 [05:32<00:00,  1.10it/s]

Note that this was significantly slower before the data preparation function was restructured. This might have had to do with the creation of the metadata arrays.


In [18]:
# Let's see what the mean prediction CRPS is
np.mean(valid_crps_list)


Out[18]:
1.0196554376205749

In [19]:
# Save the results
results_df.to_csv(results_dir + 'emos_network_rolling_window.csv')

This file is then read by the evaluation script.

Keras implementation

Now let's build the same model in Keras. This will provide a good starting point for expanding the model later on.
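A minimal sketch of how such a model could be assembled, consistent with the summary below (the actual build_EMOS_network_keras in keras_models.py may differ in details):

from keras.layers import Input, Dense, Concatenate
from keras.models import Model
from keras.optimizers import SGD

def build_emos_sketch(lr=0.1):
    mean_in = Input(shape=(1,))            # ensemble mean
    std_in = Input(shape=(1,))             # ensemble standard deviation
    mu_out = Dense(1)(mean_in)             # mu = a + b * mean   -> 2 parameters
    sigma_out = Dense(1)(std_in)           # sigma = c + d * std -> 2 parameters
    out = Concatenate()([mu_out, sigma_out])
    model = Model(inputs=[mean_in, std_in], outputs=out)
    model.compile(optimizer=SGD(lr=lr), loss=crps_cost_function)
    return model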


In [20]:
model_keras = build_EMOS_network_keras(compile=True, optimizer='sgd', lr=0.1)

In [21]:
model_keras.summary()


____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 1)             0                                            
____________________________________________________________________________________________________
input_2 (InputLayer)             (None, 1)             0                                            
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 1)             2           input_1[0][0]                    
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 1)             2           input_2[0][0]                    
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 2)             0           dense_1[0][0]                    
                                                                   dense_2[0][0]                    
====================================================================================================
Total params: 4
Trainable params: 4
Non-trainable params: 0
____________________________________________________________________________________________________

Predict for one day

In Keras the early stopping algorithm works slightly differently: it stops training once the training loss hasn't decreased by at least min_delta within a certain number of epochs (patience).


In [22]:
# Use the entire training set per update (batch gradient descent), just as in the Theano model
batch_size = train_mean.shape[0]   
batch_size


Out[22]:
12619

In [23]:
model_keras.fit([train_mean, train_std], train_set.targets, epochs=steps_max, 
                batch_size=batch_size,
                validation_data=[[test_mean, test_std], test_set.targets], 
                verbose=0,
                callbacks=[EarlyStopping(monitor='loss', 
                                         min_delta=early_stopping_delta,
                                         patience=2)]);


Out[23]:
<keras.callbacks.History at 0x111c78320>

In [24]:
# Get train and test CRPS
(model_keras.evaluate([train_mean, train_std], train_set.targets, batch_size, verbose=0), 
 model_keras.evaluate([test_mean, test_std], test_set.targets, batch_size, verbose=0))


Out[24]:
(1.1318239363667209, 0.77177857220961432)

We get very similar results to the Theano implementation.

Post-processing for 2016

Same procedure as above with the Theano model, now using the Keras model.


In [28]:
model_keras = build_EMOS_network_keras(compile=True, optimizer='sgd', lr=0.1)

In [29]:
date_str_start = '2016-01-01'
date_str_stop = '2017-01-01'
# This function loops over the days.
train_crps_list, valid_crps_list, results_df = loop_over_days(
    DATA_DIR,
    model_keras,
    date_str_start, date_str_stop, 
    window_size=window_size,
    fclt=fclt,     
    epochs_max=steps_max, 
    early_stopping_delta=early_stopping_delta, 
    lr=lr,
    verbose=0,
    model_type='EMOS_Network_keras')


  0%|          | 0/366 [00:00<?, ?it/s]
100%|██████████| 366/366 [08:22<00:00,  1.08s/it]

The Keras implementation is slower than the pure Theano version. This could be due to the overhead of calling model.fit many times.


In [30]:
np.mean(train_crps_list), np.mean(valid_crps_list)


Out[30]:
(0.99209009628307276, 1.007831244335595)

The results are slightly better than with the Theano implementation. This could be due to random variability or to the difference in the early stopping algorithm.


In [31]:
results_df.to_csv(results_dir + 'emos_network_rolling_window_keras.csv')

Train 2015, predict 2016

Finally, we train a single model on all of the 2015 data and then post-process all of 2016 with this one model.


In [32]:
train_dates = ['2015-01-01', '2016-01-01']
test_dates =  ['2016-01-01', '2017-01-01']

In [33]:
# Load data sets
train_set, test_set = get_train_test_sets(DATA_DIR, train_dates, test_dates)


train set contains 365 days
test set contains 366 days

In [34]:
model_keras = build_EMOS_network_keras(compile=True, optimizer='adam', lr=0.1)

In [35]:
# Split dataset
train_mean = train_set.features[:, 0]
train_std = train_set.features[:, 1]
test_mean = test_set.features[:, 0]
test_std = test_set.features[:, 1]

In [36]:
model_keras.fit([train_mean, train_std], train_set.targets, epochs=10, 
                batch_size=1024, 
                validation_data=[[test_mean, test_std], test_set.targets])


Train on 180849 samples, validate on 182218 samples
Epoch 1/10
180849/180849 [==============================] - 0s - loss: 2.9971 - val_loss: 1.9343
Epoch 2/10
180849/180849 [==============================] - 0s - loss: 1.3775 - val_loss: 1.0720
Epoch 3/10
180849/180849 [==============================] - 0s - loss: 1.0752 - val_loss: 1.0134
Epoch 4/10
180849/180849 [==============================] - 0s - loss: 1.0694 - val_loss: 1.0116
Epoch 5/10
180849/180849 [==============================] - 0s - loss: 1.0692 - val_loss: 1.0130
Epoch 6/10
180849/180849 [==============================] - 0s - loss: 1.0692 - val_loss: 1.0118
Epoch 7/10
180849/180849 [==============================] - 0s - loss: 1.0693 - val_loss: 1.0123
Epoch 8/10
180849/180849 [==============================] - 0s - loss: 1.0693 - val_loss: 1.0123
Epoch 9/10
180849/180849 [==============================] - 0s - loss: 1.0694 - val_loss: 1.0120
Epoch 10/10
180849/180849 [==============================] - 0s - loss: 1.0693 - val_loss: 1.0114
Out[36]:
<keras.callbacks.History at 0x11920b2e8>

So we get a very similar CRPS compared to the 25-day rolling window. This suggests that seasonality is not that important.


In [37]:
# Get predictions
preds = model_keras.predict([test_mean, test_std])

In [38]:
# Save predictions
results_df = create_results_df(test_set.date_strs, test_set.station_ids,
                               preds[:, 0], preds[:, 1])

In [39]:
results_df.head()


Out[39]:
date mean station_id std
0 2016-01-01 4.426320 44.0 -1.643140
1 2016-01-01 1.619249 71.0 -2.377700
2 2016-01-01 0.592518 73.0 -1.763448
3 2016-01-01 4.381699 78.0 -1.691318
4 2016-01-01 1.948497 91.0 -2.535466

In [40]:
results_df.to_csv(results_dir + 'emos_network_train_2015_pred_2016.csv')

In [ ]: