Searches through the model space specified in the specials to identify the
best ARIMA model, with the lowest AIC, AICc or BIC value. It is implemented
using `stats::arima()` and allows ARIMA models to be used in the fable
framework.

ARIMA(formula, ic = c("aicc", "aic", "bic"), selection_metric = function(x) x[[ic]], stepwise = TRUE, greedy = TRUE, approximation = NULL, order_constraint = p + q + P + Q <= 6 & (constant + d + D <= 2), unitroot_spec = unitroot_options(), trace = FALSE, ...)

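Argument | Description |
---|---|
`formula` | Model specification (see "Specials" section). |
`ic` | The information criterion used in selecting the model. |
`selection_metric` | A function used to compute a metric from an estimated model, which is minimised to select the best model. The default, `function(x) x[[ic]]`, returns the chosen `ic`. |
`stepwise` | Should a stepwise search be used? (A stepwise search can be much faster.) |
`greedy` | Should the stepwise search move to the next best option immediately? |
`approximation` | Should CSS (conditional sum of squares) be used during model selection? The default is `NULL`. |
`order_constraint` | A logical predicate on the orders of `p`, `d`, `q`, `P`, `D`, `Q` and `constant` to consider in the search. |
`unitroot_spec` | A specification of unit root tests to use in the selection of `d` and `D`. See `unitroot_options()`. |
`trace` | If `TRUE`, the selection metric of each model estimated during the search is printed to the console. |
`...` | Further arguments for `stats::arima()`. |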

A model specification.

The fable `ARIMA()` function uses an alternate parameterisation of constants
to `stats::arima()` and `forecast::Arima()`. While the parameterisations
are equivalent, the coefficients for the constant/mean will differ.

In `fable`, the parameterisation used is:

$$(1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d y_t = c + (1 + \theta_1 B + \cdots + \theta_q B^q)\varepsilon_t$$

In stats and forecast, an ARIMA model is parameterised as:

$$(1-\phi_1B - \cdots - \phi_p B^p)(y_t' - \mu) = (1 + \theta_1 B + \cdots + \theta_q B^q)\varepsilon_t$$

where \(y_t' = (1-B)^d y_t\) is the differenced series, \(\mu\) is the mean of \(y_t'\), and \(c = \mu(1-\phi_1 - \cdots - \phi_p )\).
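The relation between \(c\) and \(\mu\) can be checked by expanding the stats/forecast form. A short derivation, writing \(y_t' = (1-B)^d y_t\) and using the fact that the backshift operator leaves a constant unchanged (\(B\mu = \mu\)):

$$(1-\phi_1B - \cdots - \phi_p B^p)(y_t' - \mu) = (1-\phi_1B - \cdots - \phi_p B^p)y_t' - \mu(1-\phi_1 - \cdots - \phi_p)$$

Substituting this into the stats/forecast equation and moving the constant term to the right-hand side gives

$$(1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d y_t = \mu(1-\phi_1 - \cdots - \phi_p) + (1 + \theta_1 B + \cdots + \theta_q B^q)\varepsilon_t$$

which is the fable form with \(c = \mu(1-\phi_1 - \cdots - \phi_p)\). For example, with \(p = 1\), \(\phi_1 = 0.5\) and \(\mu = 10\), the fable constant would be \(c = 10 \times (1 - 0.5) = 5\).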

The *specials* define the space over which `ARIMA` will search for the model that best fits the data. If the RHS of `formula` is left blank, the default search space is given by `pdq() + PDQ()`: that is, a model with candidate seasonal and nonseasonal terms, but no exogenous regressors. Note that a seasonal model requires at least 2 full seasons of data; if this is not available, `ARIMA` will revert to a nonseasonal model with a warning.

To specify a model fully (avoid automatic selection), the intercept and `pdq()/PDQ()` values must be specified. For example, `formula = response ~ 1 + pdq(1, 1, 1) + PDQ(1, 0, 0)`.
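As a minimal sketch of the difference between automatic and fully specified fits, the following assumes the fable, fabletools and tsibble packages are installed and uses the built-in `USAccDeaths` series (converted with `as_tsibble()`, which stores the measurements in a column named `value`). The model names `auto` and `manual` are arbitrary labels chosen for illustration.

```r
library(fable)
library(fabletools)  # model(), glance()
library(tsibble)     # as_tsibble() for ts objects

deaths <- as_tsibble(USAccDeaths)

fits <- model(
  deaths,
  # Empty RHS: search over the default pdq() + PDQ() space
  auto = ARIMA(log(value)),
  # Intercept, pdq() and PDQ() all given: no automatic selection
  manual = ARIMA(log(value) ~ 1 + pdq(1, 1, 1) + PDQ(1, 0, 0))
)

# One row per model, including the information criteria of each fit
glance(fits)
```

Comparing the two rows of `glance(fits)` is one way to see what the automatic search gains over a hand-picked specification.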

The `pdq` special is used to specify non-seasonal components of the model.

pdq(p = 0:5, d = 0:2, q = 0:5, p_init = 2, q_init = 2, fixed = list())

Argument | Description |
---|---|
`p` | The order of the non-seasonal auto-regressive (AR) terms. If multiple values are provided, the one which minimises `ic` will be chosen. |
`d` | The order of integration for non-seasonal differencing. If multiple values are provided, one of the values will be selected via repeated KPSS tests. |
`q` | The order of the non-seasonal moving average (MA) terms. If multiple values are provided, the one which minimises `ic` will be chosen. |
`p_init` | If `stepwise = TRUE`, `p_init` provides the initial value for `p` for the stepwise search procedure. |
`q_init` | If `stepwise = TRUE`, `q_init` provides the initial value for `q` for the stepwise search procedure. |
`fixed` | A named list of fixed parameters for coefficients. The names identify the coefficient, beginning with either `ar` or `ma`, followed by the lag order. For example, `fixed = list(ar1 = 0.3, ma2 = 0)`. |
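To illustrate how the `pdq()` arguments narrow the search, this sketch (assuming fable, fabletools and tsibble are installed, and using the built-in `USAccDeaths` series) fixes `d` and restricts `p` and `q` to small ranges. The label `restricted` is an arbitrary name chosen for illustration.

```r
library(fable)
library(fabletools)  # model(), report()
library(tsibble)     # as_tsibble() for ts objects

deaths <- as_tsibble(USAccDeaths)

# d is fixed at 1; p and q are each chosen from 0, 1, 2 by minimising ic.
# The seasonal part is fully specified, so only p, q and the constant
# are left to the search.
fit <- model(
  deaths,
  restricted = ARIMA(log(value) ~ pdq(p = 0:2, d = 1, q = 0:2) + PDQ(0, 1, 1))
)

report(fit)
```

Passing a single value for an order removes it from the search, while a vector keeps it as a candidate set.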

The `PDQ` special is used to specify seasonal components of the model. To force a non-seasonal fit, specify `PDQ(0, 0, 0)` in the RHS of the model formula. Note that simply omitting `PDQ` from the formula will *not* result in a non-seasonal fit.

PDQ(P = 0:2, D = 0:1, Q = 0:2, period = NULL, P_init = 1, Q_init = 1, fixed = list())

Argument | Description |
---|---|
`P` | The order of the seasonal auto-regressive (SAR) terms. If multiple values are provided, the one which minimises `ic` will be chosen. |
`D` | The order of integration for seasonal differencing. If multiple values are provided, one of the values will be selected via repeated heuristic tests (based on strength of seasonality from an STL decomposition). |
`Q` | The order of the seasonal moving average (SMA) terms. If multiple values are provided, the one which minimises `ic` will be chosen. |
`period` | The periodic nature of the seasonality. This can be either a number indicating the number of observations in each seasonal period, or text to indicate the duration of the seasonal window (for example, annual seasonality would be "1 year"). |
`P_init` | If `stepwise = TRUE`, `P_init` provides the initial value for `P` for the stepwise search procedure. |
`Q_init` | If `stepwise = TRUE`, `Q_init` provides the initial value for `Q` for the stepwise search procedure. |
`fixed` | A named list of fixed parameters for coefficients. The names identify the coefficient, beginning with either `sar` or `sma`, followed by the lag order. For example, `fixed = list(sar1 = 0.1)`. |
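The two most common uses of `PDQ()` described above, forcing a non-seasonal fit and stating the seasonal period explicitly, might look like the sketch below. It assumes fable, fabletools and tsibble are installed and uses the monthly `USAccDeaths` series; the labels `nonseasonal` and `yearly` are arbitrary names chosen for illustration.

```r
library(fable)
library(fabletools)  # model()
library(tsibble)     # as_tsibble() for ts objects

deaths <- as_tsibble(USAccDeaths)  # monthly observations

fits <- model(
  deaths,
  # PDQ(0, 0, 0) forces a non-seasonal fit; pdq() is still searched
  nonseasonal = ARIMA(log(value) ~ PDQ(0, 0, 0)),
  # Period given as a duration; for monthly data "1 year" spans 12 observations
  yearly = ARIMA(log(value) ~ PDQ(period = "1 year"))
)

fits
```

Printing the mable shows the orders selected for each model, which makes it easy to confirm that the first has no seasonal terms.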

Exogenous regressors can be included in an ARIMA model without explicitly using the `xreg()` special. Common exogenous regressor specials as specified in `common_xregs` can also be used. These regressors are handled using `stats::model.frame()`, and so interactions and other functionality behave similarly to `stats::lm()`.

The inclusion of a constant in the model follows similar rules to `stats::lm()`, where including `1` will add a constant and `0` or `-1` will remove the constant. If left out, the inclusion of a constant will be determined by minimising `ic`.

xreg(..., fixed = list())

Argument | Description |
---|---|
`...` | Bare expressions for the exogenous regressors (such as `log(x)`). |
`fixed` | A named list of fixed parameters for coefficients. The names identify the coefficient, and should match the name of the regressor. For example, `fixed = list(constant = 20)`. |
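As a hedged illustration of regressors and the constant, the sketch below combines an explicit constant (`1`) with Fourier terms from `common_xregs` and switches off the seasonal ARIMA part, so that seasonality is carried by the regressors alone. It assumes fable, fabletools and tsibble are installed; the choice of `K = 2` and the label `harmonic` are arbitrary and made only for this example.

```r
library(fable)
library(fabletools)  # model(), tidy()
library(tsibble)     # as_tsibble() for ts objects

deaths <- as_tsibble(USAccDeaths)

fit <- model(
  deaths,
  # `1` requests a constant; fourier(K = 2) adds two pairs of harmonic
  # regressors; PDQ(0, 0, 0) removes seasonal ARIMA terms.
  # The non-seasonal orders are still selected automatically.
  harmonic = ARIMA(log(value) ~ 1 + fourier(K = 2) + PDQ(0, 0, 0))
)

# Coefficient table: regression terms alongside any AR/MA terms selected
tidy(fit)
```

Replacing `1` with `0` in the formula would drop the constant, and omitting both would leave the decision to the `ic` comparison.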

Forecasting: Principles and Practice, ARIMA models (chapter 9)

Forecasting: Principles and Practice, Dynamic regression models (chapter 10)

    library(fable)
    library(tsibble)
    library(dplyr)

    # Manual ARIMA specification
    USAccDeaths %>%
      as_tsibble() %>%
      model(arima = ARIMA(log(value) ~ 0 + pdq(0, 1, 1) + PDQ(0, 1, 1))) %>%
      report()
    #> Series: value
    #> Model: ARIMA(0,1,1)(0,1,1)[12]
    #> Transformation: log(value)
    #>
    #> Coefficients:
    #>           ma1     sma1
    #>       -0.4713  -0.5926
    #> s.e.   0.1230   0.1933
    #>
    #> sigma^2 estimated as 0.001379:  log likelihood=109.31
    #> AIC=-212.63   AICc=-212.19   BIC=-206.39

    # Regression on Population with automatically selected ARIMA errors
    tsibbledata::global_economy %>%
      filter(Country == "Australia") %>%
      model(ARIMA(log(GDP) ~ Population))
    #> Warning: NaNs produced
    #> # A mable: 1 x 2
    #> # Key:     Country [1]
    #>   Country   `ARIMA(log(GDP) ~ Population)`
    #>   <fct>                            <model>
    #> 1 Australia    <LM w/ ARIMA(2,0,0) errors>