ARIMA

Sections:

  1. Conceptual Approach
  2. Assumptions
  3. Limitations / Disadvantages
  4. Example Code

1. Conceptual Approach

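ARIMA stands for AutoRegressive Integrated Moving Average. A basic ARIMA model has three parameters (p, d, q), one for each of its parts:

  • Autoregressive (AR): a regression on the p previous values of the series
  • Integrated (I): the series is differenced d times to make it stationary
  • Moving Average (MA): a regression on the q previous residuals (error terms)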

Two linear time series models are widely used in the literature: the Autoregressive (AR) and Moving Average (MA) models. ARIMA combines the two into a single model and adds a differencing step.

An AR model, as the name implies, is a regression on the previous values of the series, to predict the next value.

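An $AR(p)$ model is given by

$$ X_t = c + \sum_{i=1}^{p}\phi_i X_{t-i} + \varepsilon_t $$

where $c$ is a constant, $\phi_1, ..., \phi_p$ are the parameters of the model and $\varepsilon_t$ is a white noise error term.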
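To make the recursion concrete, here is a minimal sketch in base R that simulates an AR(2) series step by step, adding a white-noise shock at each step, and then recovers the coefficients with `arima()`. The coefficient values (0.6, -0.3), the constant, the seed and the series length are arbitrary choices for illustration.

```r
set.seed(42)

n   <- 2000              # series length (illustrative)
phi <- c(0.6, -0.3)      # assumed AR coefficients; this choice is stationary
c0  <- 1                 # assumed constant c

eps <- rnorm(n)          # white-noise shocks
x   <- numeric(n)
for (t in 3:n) {
  # X_t = c + phi_1 * X_{t-1} + phi_2 * X_{t-2} + eps_t
  x[t] <- c0 + phi[1] * x[t - 1] + phi[2] * x[t - 2] + eps[t]
}
x <- x[-(1:200)]         # drop a burn-in period so the start values do not matter

# Fit an AR(2); order = c(p, d, q) with d = 0 and q = 0
ar_fit <- arima(x, order = c(2, 0, 0))
print(coef(ar_fit))
```

Note that `arima()` reports the mean of the series under the name `intercept`, not the constant $c$. For a stationary AR(p) the two are related by $\mu = c / (1 - \sum_i \phi_i)$, which is about 1.43 for the values assumed here.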

The notation $MA(q)$ refers to the moving average model of order q:

$$ X_t = \mu + \varepsilon_t + \sum_{i=1}^{q}\theta_i \varepsilon_{t-i} $$

where $\mu$ is the mean of the series, $\theta_1, ..., \theta_q$ are the parameters of the model and $\varepsilon_t, \varepsilon_{t-1}, ..., \varepsilon_{t-q}$ are white noise error terms.
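As a rough illustration of an MA(q) process, the sketch below simulates an MA(2) series with base R's `arima.sim()`, fits it with `arima()`, and prints the sample autocorrelations, which for an MA(2) should be close to zero beyond lag 2. The coefficients, mean, seed and sample size are assumptions made for the example.

```r
set.seed(1)

theta <- c(0.5, 0.4)                     # assumed MA coefficients (illustrative)
mu    <- 10                              # assumed series mean

# arima.sim() draws the white-noise terms and applies the MA filter
y <- mu + arima.sim(model = list(ma = theta), n = 2000)

ma_fit <- arima(y, order = c(0, 0, 2))   # ARIMA(0,0,2) is an MA(2)
print(coef(ma_fit))                      # ma1, ma2 and intercept (the mean mu)

# Sample autocorrelations at lags 1 to 6 (lag 0 is dropped)
rho <- acf(y, lag.max = 6, plot = FALSE)$acf[-1]
print(round(rho, 2))
```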

AR and MA models can be combined to form a general and useful class of time series models, known as ARMA models. An $ARMA(p, q)$ model is given by

$$ X_t = c + \varepsilon_t + \sum_{i=1}^{p}\phi_i X_{t-i} + \sum_{i=1}^{q}\theta_i \varepsilon_{t-i} $$
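The combined model can be tried out in the same way. This is a small sketch, assuming an ARMA(1,1) with coefficients 0.7 and 0.4 picked only for illustration; it simulates the series, fits it, and produces a few forecasts with base R's `predict()`.

```r
set.seed(7)

# ARMA(1,1) with assumed coefficients phi_1 = 0.7 and theta_1 = 0.4
z <- arima.sim(model = list(ar = 0.7, ma = 0.4), n = 2000)

arma_fit <- arima(z, order = c(1, 0, 1))   # p = 1, d = 0, q = 1
print(coef(arma_fit))

# One-step-ahead to five-step-ahead forecasts with their standard errors
fc <- predict(arma_fit, n.ahead = 5)
print(fc$pred)
print(fc$se)
```

The standard errors grow with the horizon, since each further step adds the uncertainty of one more unobserved shock.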

The lag or backshift operator $L$ is defined by $L X_t = X_{t-1}$, so that $L^k X_t = X_{t-k}$.
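The lag operator is easy to mimic on a plain vector. The helper `lag_op()` below is hypothetical (it is not part of any package); it is written out explicitly because base R's `lag()` shifts the time index of a `ts` object rather than the values themselves.

```r
# Hypothetical helper: apply the lag operator k times, so (L^k x)_t = x_{t-k}.
# The first k positions have no predecessor and are filled with NA.
lag_op <- function(x, k = 1) {
  stopifnot(k >= 0, k <= length(x))
  if (k == 0) return(x)
  c(rep(NA, k), x[seq_len(length(x) - k)])
}

x <- c(5, 7, 4, 9, 6)

print(lag_op(x))          # NA 5 7 4 9
print(lag_op(x, 2))       # NA NA 5 7 4

# (1 - L) x_t = x_t - x_{t-1} is the first difference, which is what diff() computes
first_diff <- x - lag_op(x)
print(first_diff[-1])     # 2 -3 5 -3
print(diff(x))            # 2 -3 5 -3
```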

The ARMA models described above can only be used for stationary time series. In practice, however, many time series, such as those arising in socio-economic and business settings, are non-stationary; in particular, any series that contains a trend or a seasonal pattern is non-stationary. ARMA models are therefore inadequate for many of the series encountered in practice. The ARIMA model generalizes ARMA to cover this case: a non-stationary series is made stationary by differencing it a finite number of times, d, and an ARMA model is then fitted to the differenced series. The mathematical formulation of the ARIMA(p,d,q) model using lag polynomials is given below.
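The effect of differencing can be seen on a simulated series. The sketch below assumes a random walk with drift (the drift of 0.2, the seed and the length are arbitrary) and compares the series with its first difference.

```r
set.seed(123)

n <- 500
# Random walk with drift: X_t = 0.2 + X_{t-1} + eps_t
rw <- cumsum(0.2 + rnorm(n))

d1 <- diff(rw, differences = 1)   # (1 - L) X_t

# The level series trends upward; its first difference fluctuates around the drift
print(var(rw))
print(var(d1))
print(mean(d1))

# Fitting ARIMA(0,1,0) lets arima() do the differencing internally.
# arima() omits the constant when d > 0, so the drift is absorbed into sigma2.
rw_fit <- arima(rw, order = c(0, 1, 0))
print(rw_fit$sigma2)
```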

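$$ \left(1 - \sum_{i=1}^{p}\phi_i L^i\right)(1 - L)^d X_t = c + \left(1 + \sum_{i=1}^{q}\theta_i L^i\right)\varepsilon_t $$

where $L$ is the lag operator ($L X_t = X_{t-1}$), d is the number of times the series is differenced, and setting d = 0 recovers the ARMA(p,q) model.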
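As a worked case, consider ARIMA(1,1,1), written here in the standard lag-polynomial form with one AR coefficient $\phi_1$, one MA coefficient $\theta_1$ and a single difference:

$$ (1 - \phi_1 L)(1 - L) X_t = c + (1 + \theta_1 L)\varepsilon_t $$

Multiplying out the left-hand side gives $(1 - \phi_1 L)(1 - L) = 1 - (1 + \phi_1)L + \phi_1 L^2$, so the model can be rewritten as an explicit recursion:

$$ X_t = c + (1 + \phi_1) X_{t-1} - \phi_1 X_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} $$

Equivalently, the differenced series $W_t = X_t - X_{t-1}$ follows the ARMA(1,1) model $W_t = c + \phi_1 W_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}$.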

2. Assumptions

  • For ARMA, the series must be stationary
  • For ARIMA, the differenced series must be stationary
  • Most of the data should be non-zero
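One way to check the stationarity assumptions is a unit-root test. The sketch below uses the Phillips-Perron test that ships with base R (`PP.test()`) on a simulated random walk and on its first difference; the data, seed and choice of test are assumptions for illustration, and other tests could be used instead.

```r
set.seed(2024)

rw    <- cumsum(rnorm(300))   # random walk: non-stationary by construction
rw_d1 <- diff(rw)             # its first difference is white noise

# The null hypothesis of the Phillips-Perron test is a unit root
# (non-stationarity), so a small p-value is evidence of stationarity
p_level <- PP.test(rw)$p.value
p_diff  <- PP.test(rw_d1)$p.value

print(c(level = p_level, differenced = p_diff))
```

A small p-value for the differenced series but not for the original one would suggest d = 1.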

3. Limitations / Disadvantages

  • Stationarity conditions most likely won't hold in real datasets
  • Doesn't consider other variables, i.e. it's auto-regressive: the series is predicted only from its own past values and errors

4. Example Code


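In [6]:
# forecast provides Arima() and forecast(); fpp provides the usconsumption dataset
library(forecast)
library(fpp)

In [7]:
# Fit an ARIMA(0,0,3) model, i.e. an MA(3), to the first column of usconsumption
fit <- Arima(usconsumption[,1], order=c(0,0,3))

In [8]:
# Forecast from the fitted model and plot the forecasts with their prediction intervals
plot(forecast(fit))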
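The same fit, diagnose and forecast workflow can also be sketched with base R alone. The example below is an illustration under stated assumptions: it swaps in the built-in `LakeHuron` series for `usconsumption`, uses an ARMA(1,1) order chosen only as an example, and checks the residuals with a Ljung-Box test via `Box.test()`.

```r
# LakeHuron (annual lake level, 1875-1972) ships with base R
lake_fit <- arima(LakeHuron, order = c(1, 0, 1))   # ARMA(1,1), an illustrative choice
print(lake_fit)

# Ljung-Box test on the residuals; fitdf = p + q adjusts the degrees of freedom.
# A large p-value means no evidence of leftover autocorrelation.
lb <- Box.test(residuals(lake_fit), lag = 10, type = "Ljung-Box", fitdf = 2)
print(lb$p.value)

# Point forecasts and approximate 95% intervals for the next 5 years
fc    <- predict(lake_fit, n.ahead = 5)
upper <- fc$pred + 1.96 * fc$se
lower <- fc$pred - 1.96 * fc$se
print(cbind(lower, forecast = fc$pred, upper))
```

If the residuals still show autocorrelation, a different order (p, d, q) is worth trying before relying on the forecasts.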