ARIMA

Sections:

  1. Conceptual Approach
  2. Assumptions
  3. Limitations / Disadvantages
  4. Example Code

1. Conceptual Approach

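ARIMA stands for AutoRegressive Integrated Moving Average. A basic ARIMA(p,d,q) model has three parameters: the autoregressive order p, the degree of differencing d, and the moving average order q. There are multiple parts to ARIMA:

  • Auto Regressive (AR): regression on the series' own previous values
  • Integrated (I): differencing the series to make it stationary
  • Moving Average (MA): regression on previous residuals (white noise error terms)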

Two linear time series models are widely used in the literature: the Autoregressive (AR) and the Moving Average (MA) models. ARIMA combines the two into a single model.

An AR model, as the name implies, is a regression on the previous values of the series, to predict the next value.

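An $AR(p)$ model is given by

$$ X_t = c + \sum_{i=1}^{p}\phi_i X_{t-i} + \varepsilon_t $$

where $c$ is a constant, the $\phi_1, ..., \phi_p$ are the parameters of the model and $\varepsilon_t$ is a white noise error term.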
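As a rough illustration of the AR recursion, the sketch below simulates an AR(1) series by hand and then recovers the coefficient with base R's `arima()`. The seed, series length and parameter values are arbitrary choices made for this example, not values from the notebook.

```r
set.seed(42)                      # arbitrary seed, for reproducibility

n   <- 500                        # series length (illustrative)
c0  <- 1                          # the constant c in the AR equation
phi <- 0.6                        # AR(1) coefficient; |phi| < 1 keeps the series stationary

eps  <- rnorm(n)                  # white noise error terms
x    <- numeric(n)
x[1] <- c0 / (1 - phi)            # start at the process mean
for (t in 2:n) {
  x[t] <- c0 + phi * x[t - 1] + eps[t]
}

# Fit an AR(1). Note that arima() reports the series mean as "intercept",
# not the constant c itself.
fit_ar   <- arima(x, order = c(1, 0, 0))
phi_hat  <- unname(coef(fit_ar)["ar1"])
mean_hat <- unname(coef(fit_ar)["intercept"])
c_hat    <- mean_hat * (1 - phi_hat)   # recover c from mean = c / (1 - phi)
```

With a series of this length, `phi_hat` should land close to 0.6 and `mean_hat` close to 2.5, though the exact values depend on the seed.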

The notation $MA(q)$ refers to the moving average model of order $q$:

$$ X_t = \mu + \varepsilon_t + \sum_{i=1}^{q}\theta_i \varepsilon_{t-i} $$

where $\mu$ is the mean of the series, the $\theta_1, ..., \theta_q$ are the parameters of the model and the $\varepsilon_t, \varepsilon_{t-1}, ..., \varepsilon_{t-q}$ are white noise error terms.
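To make the MA definition concrete, here is a minimal sketch that builds an MA(2) series directly from white noise and compares a few sample statistics with what the model implies: mean $\mu$, variance $1 + \theta_1^2 + \theta_2^2$ for unit-variance noise, and autocorrelations near zero beyond lag $q$. All parameter values are assumptions made for the example.

```r
set.seed(1)                       # arbitrary seed

n     <- 5000                     # series length (illustrative)
mu    <- 5                        # mean of the series
theta <- c(0.5, 0.3)              # MA coefficients theta_1, theta_2
q     <- length(theta)

eps <- rnorm(n + q)               # q extra draws supply the lagged errors at the start

# One-sided convolution: eps_t + theta_1 * eps_{t-1} + theta_2 * eps_{t-2}
ma_part <- stats::filter(eps, c(1, theta), sides = 1)
x <- mu + as.numeric(ma_part[-(1:q)])   # drop the q leading NAs

theo_var <- 1 + sum(theta^2)      # implied variance when the noise has variance 1
acf_vals <- acf(x, lag.max = 5, plot = FALSE)$acf[-1]   # sample ACF at lags 1..5
```

The sample autocorrelations at lags 3 to 5 should be close to zero, which is the usual signature of an MA(2) process.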

AR and MA models can be effectively combined together to form a general and useful class of time series models, known as the ARMA models.

$$ X_t = c + \varepsilon_t + \sum_{i=1}^{p}\phi_i X_{t-i} + \sum_{i=1}^{q}\theta_i \varepsilon_{t-i} $$
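A combined model can be tried out quickly with base R's `arima.sim()`, which simulates a zero-mean ARMA process. The sketch below simulates an ARMA(1,1) series and fits the same order back; the coefficients 0.6 and 0.4 and the series length are illustrative assumptions.

```r
set.seed(123)                     # arbitrary seed

phi   <- 0.6                      # AR coefficient (illustrative)
theta <- 0.4                      # MA coefficient (illustrative)

# Simulate a zero-mean ARMA(1,1) series
x <- arima.sim(model = list(ar = phi, ma = theta), n = 1000)

# Fit ARMA(1,1), written as ARIMA(1,0,1); no mean term since the series is zero-mean
fit_arma <- arima(x, order = c(1, 0, 1), include.mean = FALSE)
est <- coef(fit_arma)             # named vector with entries "ar1" and "ma1"
```

R's `arima()` uses the same sign convention for the MA terms as the equation above, so `est["ma1"]` can be compared with `theta` directly.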

The lag or backshift operator $L$ is defined by $L X_t = X_{t-1}$, so that applying it $k$ times gives $L^k X_t = X_{t-k}$.
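The operator is easy to mimic on a plain vector. In the sketch below, `backshift()` is a hypothetical helper written for this example (it is not part of any package); it shifts a vector by `k` positions and pads the start with `NA`.

```r
# Hypothetical helper: apply the lag operator k times to a numeric vector
backshift <- function(x, k = 1) {
  stopifnot(k >= 0, k <= length(x))
  c(rep(NA, k), x[seq_len(length(x) - k)])
}

y <- c(3, 5, 8, 12, 17)

lag1 <- backshift(y)              # NA  3  5  8 12
lag2 <- backshift(y, 2)           # NA NA  3  5  8

# (1 - L) applied to y is the first difference, which is what diff() computes
first_diff <- y - backshift(y)    # NA  2  3  4  5
```

Apart from the leading `NA`, `first_diff` matches `diff(y)`.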

The ARMA models described above can only be used for stationary time series. In practice, however, many time series, such as those arising in socio-economic and business settings, show non-stationary behavior. Time series that contain trend or seasonal patterns are also non-stationary. From an application viewpoint, ARMA models are therefore inadequate for many of the series encountered in practice.

The ARIMA model generalizes the ARMA model to cover non-stationary series as well. A non-stationary series is made stationary by differencing it $d$ times, and an ARMA model is then applied to the differenced series. Using lag polynomials, the ARIMA(p,d,q) model is

$$ \left(1 - \sum_{i=1}^{p}\phi_i L^i\right)(1 - L)^d X_t = c + \left(1 + \sum_{i=1}^{q}\theta_i L^i\right)\varepsilon_t $$

where $L$ is the lag operator and $d$ is the order of differencing. With $d = 0$ this reduces to the ARMA(p,q) model.
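The role of differencing can be sketched as follows: a non-stationary series is built by cumulatively summing stationary AR(1) increments, and an ARIMA(1,1,0) fit on the raw series is compared with an AR(1) fit on the manually differenced series. The AR coefficient 0.5 and the series length are assumptions chosen for the example.

```r
set.seed(2024)                    # arbitrary seed

# Stationary AR(1) increments, then integrate them once to get a non-stationary series
w <- as.numeric(arima.sim(model = list(ar = 0.5), n = 500))
x <- cumsum(w)

# First differencing, (1 - L) x, recovers the increments (apart from the first one)
dx <- diff(x)

# Let arima() difference internally (d = 1) ...
fit_i <- arima(x, order = c(1, 1, 0))
# ... or difference by hand and fit a zero-mean AR(1) to the result
fit_d <- arima(dx, order = c(1, 0, 0), include.mean = FALSE)

ar_i <- unname(coef(fit_i)["ar1"])
ar_d <- unname(coef(fit_d)["ar1"])
```

The two estimates should agree closely, since both fits describe the same differenced series.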

2. Assumptions

  • For ARMA, the series must be stationary
  • For ARIMA, the differenced series must be stationary
  • Most of the data should be non-zero
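One possible way to check the stationarity assumptions is a unit-root test. The sketch below uses the Phillips-Perron test from base R (`PP.test()` in the stats package); this is a choice made for the example, since the notebook does not prescribe a particular test, and the simulated series are assumptions as well. A small p-value is evidence against a unit root.

```r
set.seed(7)                       # arbitrary seed

n <- 300
white_noise <- rnorm(n)           # stationary by construction
random_walk <- cumsum(rnorm(n))   # non-stationary: has a unit root

pp_noise <- PP.test(white_noise)          # expect a small p-value
pp_walk  <- PP.test(random_walk)          # typically a large p-value
pp_diff  <- PP.test(diff(random_walk))    # differenced walk: expect a small p-value

p_values <- c(noise = pp_noise$p.value,
              walk  = pp_walk$p.value,
              diff  = pp_diff$p.value)
```

If the test fails to reject a unit root for the raw series but rejects it after one difference, that points to $d = 1$ as a reasonable starting point.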

3. Limitations / Disadvantages

  • Stationarity conditions are unlikely to hold exactly in real datasets
  • Doesn't consider other variables, i.e. it uses only the series' own past values and past errors

4. Example Code


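In [6]:
# forecast provides Arima() and forecast(); fpp provides the usconsumption data
library(forecast)
library(fpp)

In [7]:
# Fit an ARIMA(0,0,3) model, i.e. an MA(3), to the first column of usconsumption
fit <- Arima(usconsumption[,1], order=c(0,0,3))

In [8]:
# Plot the forecasts from the fitted model
plot(forecast(fit))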