auto_arima

Pmdarima brings R's auto.arima functionality to Python by wrapping statsmodels' ARIMA and SARIMAX models in a single scikit-learn-esque estimator (pmdarima.arima.ARIMA) and adding several layers of tests for the degree of differencing and seasonal differencing to identify the optimal model parameters.

Pmdarima ARIMA models:

  • Are fully picklable for easy persistence and model deployment
  • Can handle seasonal terms (unlike statsmodels' ARIMA class)
  • Follow sklearn model fit/predict conventions

In [1]:
import numpy as np
import pmdarima as pm

print('numpy version: %r' % np.__version__)
print('pmdarima version: %r' % pm.__version__)


numpy version: '1.15.3'
pmdarima version: '1.3.0-dev0'

We'll start by defining an array of data from an R time-series, wineind:

> forecast::wineind
       Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
1980 15136 16733 20016 17708 18019 19227 22893 23739 21133 22591 26786 29740
1981 15028 17977 20008 21354 19498 22125 25817 28779 20960 22254 27392 29945
1982 16933 17892 20533 23569 22417 22084 26580 27454 24081 23451 28991 31386
1983 16896 20045 23471 21747 25621 23859 25500 30998 24475 23145 29701 34365
1984 17556 22077 25702 22214 26886 23191 27831 35406 23195 25110 30009 36242
1985 18450 21845 26488 22394 28057 25451 24872 33424 24052 28449 33533 37351
1986 19969 21701 26249 24493 24603 26485 30723 34569 26689 26157 32064 38870
1987 21337 19419 23166 28286 24570 24001 33151 24878 26804 28967 33311 40226
1988 20504 23060 23562 27562 23940 24584 34303 25517 23494 29095 32903 34379
1989 16991 21109 23740 25552 21752 20294 29009 25500 24166 26960 31222 38641
1990 14672 17543 25453 32683 22449 22316 27595 25451 25421 25288 32568 35110
1991 16052 22146 21198 19543 22084 23816 29961 26773 26635 26972 30207 38687
1992 16974 21697 24179 23757 25013 24019 30345 24488 25156 25650 30923 37240
1993 17466 19463 24352 26805 25236 24735 29356 31234 22724 28496 32857 37198
1994 13652 22784 23565 26323 23779 27549 29660 23356

Note that the frequency of the data is 12:

> frequency(forecast::wineind)
[1] 12
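That frequency of 12 is what the seasonal differencing term (D) acts on: a seasonal difference subtracts the observation from m = 12 steps earlier, while ordinary differencing (d) subtracts the immediately preceding value. A minimal numpy sketch with a stand-in series (illustrative values, not the actual wineind data):

```python
import numpy as np

m = 12
y = np.arange(36, dtype=float)  # stand-in monthly series

# Seasonal difference (D=1): subtract the value from m steps earlier
seasonal_diff = y[m:] - y[:-m]

# First difference (d=1) of the seasonally differenced series
first_diff = np.diff(seasonal_diff)
```

This d=1, D=1 combination is exactly what the auto_arima calls below fix in advance.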

In [2]:
from pmdarima.datasets import load_wineind

# this is a dataset from R
wineind = load_wineind().astype(np.float64)

Fitting an ARIMA

We will first fit a seasonal ARIMA. Note that you do not need to call auto_arima in order to fit a model; if you already know the order and seasonality of your data, you can simply fit an ARIMA with those hyper-parameters:


In [3]:
from pmdarima.arima import ARIMA

fit = ARIMA(order=(1, 1, 1), seasonal_order=(0, 1, 1, 12)).fit(y=wineind)

Also note that your data does not have to exhibit seasonality to work with an ARIMA. We could fit an ARIMA to the same data with no seasonal terms at all, though on a strongly seasonal series like this one it will likely perform worse.


In [4]:
fit = ARIMA(order=(1, 1, 1), seasonal_order=None).fit(y=wineind)

Finding the optimal model hyper-parameters using auto_arima

If you are unsure (as is common) of the best parameters for your model, let auto_arima figure it out for you. auto_arima is similar to an ARIMA-specific grid search, but (by default) uses the more intelligent stepwise algorithm laid out by Hyndman and Khandakar (2008). If stepwise is False, the models are fit as in a grid search. Note that it is possible for auto_arima not to find a model that converges; if so, it raises a ValueError.
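To see why the stepwise algorithm matters, consider the size of the exhaustive grid for the bounds used below (max_p=3, max_q=3, with d and D fixed). The seasonal bounds max_P=2 and max_Q=2 are an assumption here (pmdarima's documented defaults), purely for illustration:

```python
from itertools import product

# Every (p, q, P, Q) combination an exhaustive search would have to fit,
# with d=1 and D=1 held fixed as in the calls below
p_range = range(0, 4)   # 0..max_p
q_range = range(0, 4)   # 0..max_q
P_range = range(0, 3)   # 0..max_P (assumed default of 2)
Q_range = range(0, 3)   # 0..max_Q (assumed default of 2)

grid = list(product(p_range, q_range, P_range, Q_range))
print(len(grid))  # 144 candidate models
```

Compare that with the stepwise trace below, which reaches a comparable answer after only 19 fits.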

Fitting a stepwise search:


In [5]:
# fitting a stepwise model:
stepwise_fit = pm.auto_arima(wineind, start_p=1, start_q=1, max_p=3, max_q=3, m=12,
                             start_P=0, seasonal=True, d=1, D=1, trace=True,
                             error_action='ignore',  # don't want to know if an order does not work
                             suppress_warnings=True,  # don't want convergence warnings
                             stepwise=True)  # set to stepwise

stepwise_fit.summary()


Fit ARIMA: order=(1, 1, 1) seasonal_order=(0, 1, 1, 12); AIC=3066.760, BIC=3082.229, Fit time=0.725 seconds
Fit ARIMA: order=(0, 1, 0) seasonal_order=(0, 1, 0, 12); AIC=3133.376, BIC=3139.564, Fit time=0.029 seconds
Fit ARIMA: order=(1, 1, 0) seasonal_order=(1, 1, 0, 12); AIC=3099.734, BIC=3112.109, Fit time=0.334 seconds
Fit ARIMA: order=(0, 1, 1) seasonal_order=(0, 1, 1, 12); AIC=3066.930, BIC=3079.305, Fit time=0.229 seconds
Fit ARIMA: order=(1, 1, 1) seasonal_order=(1, 1, 1, 12); AIC=3067.548, BIC=3086.110, Fit time=1.106 seconds
Fit ARIMA: order=(1, 1, 1) seasonal_order=(0, 1, 0, 12); AIC=3088.088, BIC=3100.463, Fit time=0.144 seconds
Fit ARIMA: order=(1, 1, 1) seasonal_order=(0, 1, 2, 12); AIC=3068.000, BIC=3086.563, Fit time=1.005 seconds
Fit ARIMA: order=(1, 1, 1) seasonal_order=(1, 1, 2, 12); AIC=3068.915, BIC=3090.571, Fit time=2.380 seconds
Fit ARIMA: order=(2, 1, 1) seasonal_order=(0, 1, 1, 12); AIC=3067.447, BIC=3086.010, Fit time=0.435 seconds
Fit ARIMA: order=(1, 1, 0) seasonal_order=(0, 1, 1, 12); AIC=3094.571, BIC=3106.946, Fit time=0.242 seconds
Fit ARIMA: order=(1, 1, 2) seasonal_order=(0, 1, 1, 12); AIC=3066.742, BIC=3085.305, Fit time=0.524 seconds
Fit ARIMA: order=(2, 1, 3) seasonal_order=(0, 1, 1, 12); AIC=3070.634, BIC=3095.384, Fit time=1.342 seconds
Fit ARIMA: order=(1, 1, 2) seasonal_order=(1, 1, 1, 12); AIC=3068.010, BIC=3089.666, Fit time=0.704 seconds
Fit ARIMA: order=(1, 1, 2) seasonal_order=(0, 1, 0, 12); AIC=3090.957, BIC=3106.426, Fit time=0.139 seconds
Fit ARIMA: order=(1, 1, 2) seasonal_order=(0, 1, 2, 12); AIC=3067.731, BIC=3089.387, Fit time=1.046 seconds
Fit ARIMA: order=(1, 1, 2) seasonal_order=(1, 1, 2, 12); AIC=3069.703, BIC=3094.453, Fit time=2.791 seconds
Fit ARIMA: order=(0, 1, 2) seasonal_order=(0, 1, 1, 12); AIC=nan, BIC=nan, Fit time=nan seconds
Fit ARIMA: order=(2, 1, 2) seasonal_order=(0, 1, 1, 12); AIC=3068.673, BIC=3090.330, Fit time=0.650 seconds
Fit ARIMA: order=(1, 1, 3) seasonal_order=(0, 1, 1, 12); AIC=3068.810, BIC=3090.467, Fit time=0.730 seconds
Total fit time: 14.564 seconds
Out[5]:
Statespace Model Results
Dep. Variable: y No. Observations: 176
Model: SARIMAX(1, 1, 2)x(0, 1, 1, 12) Log Likelihood -1527.371
Date: Mon, 27 May 2019 AIC 3066.742
Time: 11:17:17 BIC 3085.305
Sample: 0 HQIC 3074.278
- 176
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
intercept -100.7331 72.197 -1.395 0.163 -242.236 40.770
ar.L1 -0.5123 0.390 -1.312 0.189 -1.277 0.253
ma.L1 -0.0806 0.404 -0.200 0.842 -0.872 0.711
ma.L2 -0.4430 0.224 -1.978 0.048 -0.882 -0.004
ma.S.L12 -0.4025 0.054 -7.448 0.000 -0.508 -0.297
sigma2 7.663e+06 7.3e+05 10.495 0.000 6.23e+06 9.09e+06
Ljung-Box (Q): 48.70 Jarque-Bera (JB): 21.57
Prob(Q): 0.16 Prob(JB): 0.00
Heteroskedasticity (H): 1.18 Skew: -0.61
Prob(H) (two-sided): 0.54 Kurtosis: 4.31


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 8.14e+14. Standard errors may be unstable.

If you don't want to use the stepwise search, auto_arima can fit a random search by enabling random=True. If your random search returns too many invalid (nan) models, you might try increasing n_fits or making it an exhaustive search (stepwise=False, random=False).


In [6]:
rs_fit = pm.auto_arima(wineind, start_p=1, start_q=1, max_p=3, max_q=3, m=12,
                       start_P=0, seasonal=True, d=1, D=1, trace=True,
                       n_jobs=-1,  # We can run this in parallel by controlling this option
                       error_action='ignore',  # don't want to know if an order does not work
                       suppress_warnings=True,  # don't want convergence warnings
                       stepwise=False, random=True, random_state=42,  # we can fit a random search (not exhaustive)
                       n_fits=25)

rs_fit.summary()


Total fit time: 38.591 seconds
Out[6]:
Statespace Model Results
Dep. Variable: y No. Observations: 176
Model: SARIMAX(2, 1, 3)x(1, 1, 1, 12) Log Likelihood -1524.252
Date: Mon, 27 May 2019 AIC 3066.504
Time: 11:19:38 BIC 3094.348
Sample: 0 HQIC 3077.809
- 176
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
intercept -112.2831 131.424 -0.854 0.393 -369.870 145.304
ar.L1 -1.7290 0.052 -33.349 0.000 -1.831 -1.627
ar.L2 -0.8938 0.053 -16.983 0.000 -0.997 -0.791
ma.L1 1.1245 0.066 17.022 0.000 0.995 1.254
ma.L2 -0.3031 0.085 -3.547 0.000 -0.471 -0.136
ma.L3 -0.6783 0.050 -13.698 0.000 -0.775 -0.581
ar.S.L12 0.1278 0.157 0.815 0.415 -0.179 0.435
ma.S.L12 -0.5523 0.154 -3.586 0.000 -0.854 -0.250
sigma2 7.548e+06 0.003 2.24e+09 0.000 7.55e+06 7.55e+06
Ljung-Box (Q): 43.96 Jarque-Bera (JB): 16.51
Prob(Q): 0.31 Prob(JB): 0.00
Heteroskedasticity (H): 1.07 Skew: -0.58
Prob(H) (two-sided): 0.80 Kurtosis: 4.04


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 9.91e+24. Standard errors may be unstable.

Inspecting goodness of fit

We can look at how well the model fits in-sample data:


In [7]:
from bokeh.plotting import figure, show, output_notebook
import pandas as pd

# init bokeh
output_notebook()

def plot_arima(truth, forecasts, title="ARIMA", xaxis_label='Time',
               yaxis_label='Value', c1='#A6CEE3', c2='#B2DF8A', 
               forecast_start=None, **kwargs):
    
    # make truth and forecasts into pandas series
    n_truth = truth.shape[0]
    n_forecasts = forecasts.shape[0]
    
    # always plot truth the same
    truth = pd.Series(truth, index=np.arange(truth.shape[0]))
    
    # if no defined forecast start, start at the end
    if forecast_start is None:
        idx = np.arange(n_truth, n_truth + n_forecasts)
    else:
        # offset the forecast index by its starting point
        idx = np.arange(forecast_start, forecast_start + n_forecasts)
    forecasts = pd.Series(forecasts, index=idx)
    
    # set up the plot
    p = figure(title=title, plot_height=400, **kwargs)
    p.grid.grid_line_alpha=0.3
    p.xaxis.axis_label = xaxis_label
    p.yaxis.axis_label = yaxis_label
    
    # add the lines
    p.line(truth.index, truth.values, color=c1, legend='Observed')
    p.line(forecasts.index, forecasts.values, color=c2, legend='Forecasted')
    
    return p



In [8]:
in_sample_preds = stepwise_fit.predict_in_sample()
in_sample_preds[:10]


Out[8]:
array([  -66.61074424, 10098.89197661, 12090.04716934, 16290.75198825,
       15943.17987042, 17257.4782733 , 17797.61571866, 20199.85515377,
       21475.00800993, 20787.12947566])
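Plotting aside, a quick numeric summary such as RMSE can also characterize in-sample fit. A sketch with stand-in arrays (substitute wineind and in_sample_preds in practice, ideally skipping the first few predictions, which are distorted by the differencing boundary, as the -66.6 value above shows):

```python
import numpy as np

# Stand-in arrays for observed values and in-sample predictions
truth = np.array([100., 110., 105., 120.])
preds = np.array([102., 108., 109., 118.])

# Root-mean-squared error of the in-sample fit
rmse = float(np.sqrt(np.mean((truth - preds) ** 2)))
print(round(rmse, 3))  # 2.646
```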

In [9]:
show(plot_arima(wineind, in_sample_preds, 
                title="Original Series & In-sample Predictions", 
                c2='#FF0000', forecast_start=0))


Predicting future values

After your model is fit, you can forecast future values using the predict function, just like in scikit-learn:


In [10]:
next_25 = stepwise_fit.predict(n_periods=25)
next_25


Out[10]:
array([21966.69715352, 25983.70920565, 30225.30978848, 35417.12496267,
       13011.3618805 , 19639.41027742, 21506.58176159, 23674.33998321,
       21685.76394174, 23669.77818948, 26955.35255979, 22755.2506162 ,
       19808.16131764, 23578.7818493 , 27845.86719673, 32923.89426476,
       10475.6877784 , 17024.74533391, 18831.64797186, 20929.546713  ,
       18876.02414315, 20792.57512611, 24011.97546913, 19745.03906623,
       16731.45362497])

In [11]:
# call the plotting func
show(plot_arima(wineind, next_25))


Updating your model

ARIMAs create forecasts by using the latest observations. Over time, your forecasts will drift, and you'll need to update the model with the observed values. There are several solutions to this problem:

  • Fit a new ARIMA with the new data added to your training sample
    • You can either re-use the order discovered in the auto_arima function, or re-run auto_arima altogether.
  • Use the update method (preferred). This allows the model to update its parameters by taking several more MLE steps on the new observations (controlled by the maxiter argument), starting from the parameters it has already discovered. This approach helps you avoid over-fitting.

For this example, let's update our existing model with the next_25 we just computed, as if they were actually observed values.


In [12]:
stepwise_fit.update(next_25, maxiter=10)  # take 10 more steps
stepwise_fit.summary()


Out[12]:
Statespace Model Results
Dep. Variable: y No. Observations: 201
Model: SARIMAX(1, 1, 2)x(0, 1, 1, 12) Log Likelihood -1748.493
Date: Mon, 27 May 2019 AIC 3508.986
Time: 11:25:31 BIC 3528.404
Sample: 0 HQIC 3516.853
- 201
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
intercept -100.7331 72.099 -1.397 0.162 -242.045 40.578
ar.L1 -0.5122 0.390 -1.314 0.189 -1.277 0.252
ma.L1 -0.0806 0.403 -0.200 0.842 -0.871 0.710
ma.L2 -0.4431 0.224 -1.981 0.048 -0.882 -0.005
ma.S.L12 -0.4024 0.054 -7.458 0.000 -0.508 -0.297
sigma2 7.663e+06 7.1e+05 10.789 0.000 6.27e+06 9.05e+06
Ljung-Box (Q): 54.62 Jarque-Bera (JB): 44.08
Prob(Q): 0.06 Prob(JB): 0.00
Heteroskedasticity (H): 0.54 Skew: -0.66
Prob(H) (two-sided): 0.01 Kurtosis: 4.97


Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 7.7e+14. Standard errors may be unstable.

In [14]:
updated_data = np.concatenate([wineind, next_25])

In [15]:
# visualize new forecasts
show(plot_arima(updated_data, stepwise_fit.predict(n_periods=10)))


