This document describes the algorithmic aspects of time series forecasting in PyAF. We will describe:
Warning: This document is intended for advanced uses of PyAF. The aspects described here are not needed in a typical forecasting use case.
PyAF uses a machine learning approach to forecasting. Many time series models are generated, and their forecasting quality is compared on a validation dataset (the most recent part of the whole signal). In short, PyAF runs a competition between a large set of candidate models/hypotheses and selects the best one to perform the final forecast.
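The competition described above can be illustrated with a minimal sketch. This is not PyAF's internal code: the three candidate models, the helper names (`mape`, `select_best`) and the use of MAPE as the comparison criterion are all assumptions chosen for illustration.

```python
# Hypothetical sketch of a model competition on a validation set.
# None of these names come from PyAF; they only illustrate the idea.

def mape(actual, predicted):
    """Mean absolute percentage error (assumes no zero in `actual`)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def constant_model(train, horizon):
    """Forecast the training mean."""
    mean = sum(train) / len(train)
    return [mean] * horizon

def last_value_model(train, horizon):
    """Forecast the last observed value."""
    return [train[-1]] * horizon

def linear_model(train, horizon):
    """Forecast by extrapolating a least-squares line."""
    n = len(train)
    t_mean = (n - 1) / 2
    y_mean = sum(train) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(train))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    intercept = y_mean - slope * t_mean
    return [intercept + slope * t for t in range(n, n + horizon)]

def select_best(signal, candidates, validation_size):
    """Fit each candidate on the older part, score it on the most recent part."""
    train, valid = signal[:-validation_size], signal[-validation_size:]
    scores = {
        name: mape(valid, model(train, validation_size))
        for name, model in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores

signal = [10.0 + 2.0 * t for t in range(20)]  # toy signal with a linear trend
candidates = {
    "constant": constant_model,
    "last_value": last_value_model,
    "linear": linear_model,
}
best_name, scores = select_best(signal, candidates, validation_size=5)
print(best_name, scores)
```

On this toy signal the linear candidate wins because the data is exactly linear; with real data the ranking depends on the signal and on the chosen criterion.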
The models used/tested in PyAF are signal decompositions generated on the fly internally. An additive signal decomposition is the sum of a trend (long-term) component, a periodic component, and an irregular component, as described in http://en.wikipedia.org/wiki/Decomposition_of_time_series
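As a rough illustration of an additive decomposition $Y_t = T_t + C_t + Z_t$, the sketch below fits a linear trend, estimates a cycle of known length by averaging the detrended values at each position in the cycle, and keeps what is left as the irregular component. The function names and the fixed `period` argument are assumptions for this example; PyAF explores many trend and cycle types rather than this single combination.

```python
# Hypothetical sketch of one additive decomposition (linear trend + fixed cycle).

def fit_linear_trend(y):
    """Return (intercept, slope) of the least-squares line through y."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return y_mean - slope * t_mean, slope

def decompose(y, period):
    """Split y into trend, cycle and residue such that y = T + C + Z."""
    intercept, slope = fit_linear_trend(y)
    trend = [intercept + slope * t for t in range(len(y))]
    detrended = [v - tr for v, tr in zip(y, trend)]
    # Average the detrended values at each position inside the cycle.
    cycle_means = [
        sum(detrended[k::period]) / len(detrended[k::period])
        for k in range(period)
    ]
    cycle = [cycle_means[t % period] for t in range(len(y))]
    residue = [v - tr - c for v, tr, c in zip(y, trend, cycle)]
    return trend, cycle, residue

pattern = [1.0, -1.0, 0.0, 2.0]
y = [5.0 + 0.5 * t + pattern[t % 4] for t in range(24)]
trend, cycle, residue = decompose(y, period=4)
```

The three returned lists have the same length as the input and add back up to it, which is the defining property of the additive form.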
PyAF generates tens of possible decompositions for the input signal and outputs the best one. One can control the number and types of decompositions, enable or disable any given component, and review the performance of each decomposition.
In addition to the decomposition, PyAF allows a whole set of possible signal transformations, applied in a pre-processing step (before decomposition) and inverted in a post-processing step (after forecasting).
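To make the pre-processing/post-processing pairing concrete, here is a minimal sketch of a differencing transformation and its inverse. The helper names are hypothetical and PyAF's own implementation of its 'Difference' transformation may differ in detail; the point is only that whatever is applied before modelling has to be undone afterwards to get values back on the original scale.

```python
# Hypothetical sketch: a transformation applied before modelling
# and inverted afterwards.

def difference(x):
    """Pre-processing: keep the first value and the successive differences."""
    return x[0], [b - a for a, b in zip(x, x[1:])]

def undo_difference(first, diffs):
    """Post-processing: rebuild the original scale by cumulative summation."""
    out = [first]
    for d in diffs:
        out.append(out[-1] + d)
    return out

signal = [3, 5, 4, 8, 9]
first, diffs = difference(signal)
restored = undo_difference(first, diffs)
```

In a forecasting setting the model would be fitted on `diffs`, and the forecasted differences would be appended before calling `undo_difference`.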
PyAF performs the forecasting task of a signal $X_t$ in three steps described below:
PyAF supports the following operations :
lKnownTransformations = ['None', 'Difference', 'RelativeDifference', 'Integration', 'BoxCox', 'Quantization', 'Logit', 'Fisher', 'Anscombe']
lKnownTrends = ['ConstantTrend', 'Lag1Trend', 'LinearTrend', 'PolyTrend', 'MovingAverage', 'MovingMedian']
lKnownPeriodics = ['NoCycle', 'BestCycle', 'Seasonal_MonthOfYear', 'Seasonal_Second', 'Seasonal_Minute', 'Seasonal_Hour', 'Seasonal_DayOfWeek', 'Seasonal_DayOfMonth', 'Seasonal_WeekOfYear']
lKnownAutoRegressions = ['NoAR', 'AR', 'ARX', 'SVR', 'MLP', 'LSTM']
Optional Transformations:
These models are built on the residues of the trend and cycles:
$$Z_t = Y_t - T_t - C_t$$

We consider here some models based on the residue lags ($Lag(Z)_t = Z_{t-k}$).
The models described here are implemented using external libraries: scikit-learn (AR, ARX, SVR) or Keras (MLP, LSTM).
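The idea of a model on residue lags can be sketched without any external library. The example below fits $Z_t \approx \sum_{k=1}^{p} a_k Z_{t-k}$ by ordinary least squares (normal equations solved by Gaussian elimination). It is an assumption-laden stand-in rather than PyAF's code: PyAF delegates this step to scikit-learn, and the function names, the absence of an intercept and the choice of solver here are all illustrative.

```python
# Hypothetical sketch: autoregression on the lags of a residue series z.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_ar(z, p):
    """Least-squares coefficients a_1..a_p of z_t on z_{t-1}..z_{t-p}."""
    rows = [[z[t - k] for k in range(1, p + 1)] for t in range(p, len(z))]
    targets = z[p:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict_next(z, coeffs):
    """One-step-ahead forecast from the last len(coeffs) values of z."""
    return sum(c * z[-k] for k, c in enumerate(coeffs, start=1))

# Noise-free toy residue following z_t = 0.5 z_{t-1} - 0.3 z_{t-2}.
z = [1.0, 2.0]
for _ in range(18):
    z.append(0.5 * z[-1] - 0.3 * z[-2])

coeffs = fit_ar(z, p=2)
next_value = predict_next(z, coeffs)
```

Because the toy series contains no noise, the fit recovers the generating coefficients up to rounding; on real residues the coefficients are only estimates, and a larger `p` trades flexibility against the risk of overfitting.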
Experimental Models :
The parameter $p$ represents the dependency on the past. It can be customized.