
Economists and statisticians who work with time-series data – data collected over many periods rather than one-slice-in-time cross-sectional data – will have at least heard of ARCH and GARCH models. Wikipedia has a concise summary of these methods:

“In econometrics, the autoregressive conditional heteroscedasticity (ARCH) model is a statistical model for time series data that describes the variance of the current error term or innovation as a function of the actual sizes of the previous time periods’ error terms; often the variance is related to the squares of the previous innovations.

The ARCH model is appropriate when the error variance in a time series follows an autoregressive (AR) model; if an autoregressive moving average (ARMA) model is assumed for the error variance, the model is a generalized autoregressive conditional heteroskedasticity (GARCH) model.

ARCH models are commonly employed in modeling financial time series that exhibit time-varying volatility and volatility clustering, i.e. periods of swings interspersed with periods of relative calm. ARCH-type models are sometimes considered to be in the family of stochastic volatility models, although this is strictly incorrect since at time t the volatility is completely pre-determined (deterministic) given previous values.”

In *Introduction to Time Series Using Stata*, Sean Becketti describes some practical reasons why they are used:

“Financial markets, among other things, produce ‘price discovery’, that is, asset prices that balance the willingness of relatively bullish investors to buy with the willingness of relatively bearish investors to sell. Of course, news arrives every day that shifts the balance of investor opinion about an asset’s likely future performance.

Some news— the unemployment and inflation rates, the balance of trade, and the like— affects investor views of the market as a whole, while other news— changes in management, updated sales figures, etc.— affects only a specific sector or company. Investors digest this information, update their willingness to buy or sell assets, and their actions change prices.

The types of information that affect asset prices are many and varied, and it is impractical to model them all. As a result, their impact on prices is part of the random component of our models, and the typical magnitude of their impact helps determine the magnitude of the residual variance.

During uneventful periods, little news of import arrives, so investor opinions change relatively little. Moreover, long periods of relatively stable prices tend to narrow the range of investor opinions of asset values. In contrast, sudden and unexpected news can shift investor opinion significantly.

More importantly, extraordinary events can sow confusion among investors. Investors may struggle for some time to settle on an updated opinion of value. In these circumstances, investors may exhibit heightened reactions to scraps of information that either increase concern or restore confidence.

During these periods, the variance of the random component of prices is increased— at least, that is the way these events are reflected in our models, which, after all, are simplified representations of reality.”

Stata’s *time-series reference manual* covers a wide range of topics and includes a glossary, which I’ve reproduced in edited form below.

**Any copy/paste and editing errors are mine.**

*______________________________________________________*

**add factor.** An add factor is a quantity added to an endogenous variable in a forecast model. Add factors can be used to incorporate outside information into a model, and they can be used to produce forecasts under alternative scenarios.

**ARCH model.** An autoregressive conditional heteroskedasticity (ARCH) model is a regression model in which the conditional variance is modeled as an autoregressive (AR) process. Although the conditional variance changes over time, the unconditional variance is time invariant because y_t is a stationary process. Modeling the conditional variance as an AR process raises the implied unconditional variance, making this model particularly appealing to researchers modeling fat-tailed data, such as financial data.
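To make the definition concrete, here is a minimal ARCH(1) simulation sketch in Python. The parameter values (ω = 0.2, α = 0.7) are my own illustrative assumptions, not anything from the manual: the conditional variance is an AR process in the squared past errors, while the unconditional variance stays fixed at ω/(1 − α) as long as α < 1.

```python
import random

random.seed(42)

omega, alpha = 0.2, 0.7   # hypothetical ARCH(1) parameters, alpha < 1
n = 500
errors, variances = [], []
prev_error = 0.0
for _ in range(n):
    # conditional variance depends on the size of the previous error
    sigma2 = omega + alpha * prev_error ** 2
    e = random.gauss(0.0, sigma2 ** 0.5)
    variances.append(sigma2)
    errors.append(e)
    prev_error = e

# unconditional variance is finite because alpha < 1
uncond_var = omega / (1 - alpha)
```

Plotting `errors` would show the volatility clustering described above: large shocks tend to be followed by large shocks.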

**ARFIMA model.** An autoregressive fractionally integrated moving-average (ARFIMA) model is a time series model suitable for use with long-memory processes. ARFIMA models generalize autoregressive integrated moving-average (ARIMA) models by allowing the differencing parameter to be a real number in (−0.5,0.5) instead of requiring it to be an integer.

**ARIMA model.** An autoregressive integrated moving-average (ARIMA) model is a time-series model suitable for use with integrated processes. In an ARIMA(p,d,q) model, the data are differenced d times to obtain a stationary series, and then an ARMA(p,q) model is ﬁt to this differenced data. ARIMA models that include exogenous explanatory variables are known as ARMAX models.
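The “integrated” part is just repeated differencing. A small sketch on toy data of my own shows that differencing once removes a linear trend, leaving a constant (stationary) series:

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

# a series with a deterministic linear trend: one difference removes it
y = [3 + 2 * t for t in range(6)]   # 3, 5, 7, 9, 11, 13
dy = difference(y, d=1)             # constant differences: 2, 2, 2, 2, 2
```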

**ARMA model.** An autoregressive moving-average (ARMA) model is a time-series model in which the current period’s realization is the sum of an autoregressive (AR) process and a moving-average (MA) process. An ARMA(p,q) model includes p AR terms and q MA terms. ARMA models with just a few lags are often able to fit data as well as pure AR or MA models with many more lags.

**ARMA process.** An autoregressive moving average (ARMA) process is a time series in which the current value of the variable is a linear function of its own past values and a weighted average of current and past realizations of a white-noise process. It consists of an autoregressive component and a moving-average component; see autoregressive (AR) process and moving-average (MA) process.

**ARMAX model.** An ARMAX model is a time-series model in which the current period’s realization is an ARMA process plus a linear function of a set of exogenous variables. Equivalently, an ARMAX model is a linear regression model in which the error term is specified to follow an ARMA process.

**autocorrelation function.** The autocorrelation function (ACF) expresses the correlation between periods t and t−k of a time series as a function of the time t and the lag k. For a stationary time series, the ACF does not depend on t and is symmetric about k = 0, meaning that the correlation between periods t and t−k is equal to the correlation between periods t and t + k.
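A sample ACF is easy to compute directly. This sketch (toy data of my own) uses a series with a clean 4-period cycle, whose ACF is near zero at lag 1 but high at lag 4:

```python
def acf(x, k):
    """Sample autocorrelation at lag k: lag-k covariance over variance."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    return sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / var

# a series repeating every 4 periods
x = [2.0, 4.0, 6.0, 4.0] * 5
r0, r1, r4 = acf(x, 0), acf(x, 1), acf(x, 4)
```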

**autoregressive (AR) process.** An autoregressive (AR) process is a time series in which the current value of a variable is a linear function of its own past values and a white-noise error term. An AR(p) model contains p lagged values of the dependent variable.

**Cochrane–Orcutt estimator.** The Cochrane–Orcutt estimator is a linear regression estimator that can be used when the error term exhibits first-order autocorrelation. An initial estimate of the autocorrelation parameter ρ is obtained from OLS residuals, and then OLS is performed on the transformed data.

**cointegrating vector.** A cointegrating vector specifies a stationary linear combination of nonstationary variables.

**conditional variance.** Although the conditional variance is simply the variance of a conditional distribution, in time-series analysis the conditional variance is often modeled as an autoregressive (AR) process, giving rise to ARCH models.

**correlogram.** A correlogram is a table or graph showing the sample autocorrelations or partial autocorrelations of a time series.

**covariance stationary process.** A process is covariance stationary if the mean of the process is finite and independent of t, the unconditional variance of the process is finite and independent of t, and the covariance between periods t and t−s is finite and depends on t−s but not on t or s themselves. Covariance stationary processes are also known as weakly stationary processes. See also stationary process.

**cross-correlation function.** The cross-correlation function expresses the correlation between one series at time t and another series at time t−k as a function of the time t and lag k. If both series are stationary, the function does not depend on t.

**cyclical component.** A cyclical component is a part of a time series that is a periodic function of time. Deterministic functions of time are deterministic cyclical components, and random functions of time are stochastic cyclical components. For example, fixed seasonal effects are deterministic cyclical components and random seasonal effects are stochastic seasonal components.

**deterministic trend.** A deterministic trend is a deterministic function of time that specifies the long-run tendency of a time series.

**difference operator.** The difference operator ∆ denotes the change in the value of a variable from period t − 1 to period t.

**dynamic forecast.** A dynamic forecast uses forecast values wherever lagged values of the endogenous variables appear in the model, allowing one to forecast multiple periods into the future. See also static forecast.

**dynamic-multiplier function.** A dynamic-multiplier function measures the effect of a shock to an exogenous variable on an endogenous variable. The kth dynamic-multiplier function of variable i on variable j measures the effect on variable j in period t+k in response to a one-unit shock to variable i in period t, holding everything else constant.

**endogenous variable.** An endogenous variable is a regressor that is correlated with the unobservable error term. Equivalently, an endogenous variable is one whose values are determined by the equilibrium or outcome of a structural model.

**exogenous variable.** An exogenous variable is a regressor that is not correlated with any of the unobservable error terms in the model. Equivalently, an exogenous variable is one whose values change independently of the other variables in a structural model.

**exponential smoothing.** Exponential smoothing is a method of smoothing a time series in which the smoothed value at period t is equal to a fraction α of the series value at time t plus a fraction 1−α of the previous period’s smoothed value. The fraction α is known as the smoothing parameter.
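The definition translates directly into a one-line recursion. In this sketch I initialize the smoothed series at the first observation, which is one common convention (not the only one):

```python
def exp_smooth(series, alpha):
    """Smoothed value at t = alpha * x_t + (1 - alpha) * previous smoothed value."""
    smoothed = [series[0]]   # initialize at the first observation (a convention)
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# a step change from 10 to 20: the smoothed series approaches 20 gradually
s = exp_smooth([10.0, 20.0, 20.0, 20.0], alpha=0.5)
```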

**forecast-error variance decomposition.** Forecast-error variance decompositions measure the fraction of the error in forecasting variable i after h periods that is attributable to the orthogonalized shocks to variable j.

**forward operator.** The forward operator F denotes the value of a variable at time t + 1. A forward operator is also known as a lead operator.

**frequency-domain analysis.** Frequency-domain analysis is analysis of time-series data by considering its frequency properties. The spectral density function and the spectral distribution function are key components of frequency-domain analysis, so it is often called spectral analysis.

**gain (of a linear filter).** The gain of a linear filter scales the spectral density of the unfiltered series into the spectral density of the filtered series for each frequency. Specifically, at each frequency, multiplying the spectral density of the unfiltered series by the square of the gain of a linear filter yields the spectral density of the filtered series. If the gain at a particular frequency is 1, the filtered and unfiltered spectral densities are the same at that frequency and the corresponding stochastic cycles are passed through perfectly. If the gain at a particular frequency is 0, the filter removes all the corresponding stochastic cycles from the unfiltered series.

**GARCH model.** A generalized autoregressive conditional heteroskedasticity (GARCH) model is a regression model in which the conditional variance is modeled as an ARMA process. GARCH models are often used because the ARMA specification often allows the conditional variance to be modeled with fewer parameters than are required by a pure ARCH model. Many extensions to the basic GARCH model exist.
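A GARCH(1,1) sketch with assumed parameter values of my own: the conditional-variance recursion σ²_t = ω + α ε²_{t−1} + β σ²_{t−1} adds a lagged-variance term to the ARCH recursion, which is why a small GARCH model can stand in for a pure ARCH model with many lags.

```python
import random

random.seed(7)
omega, alpha, beta = 0.1, 0.1, 0.8   # assumed values; alpha + beta < 1
n = 400
sigma2 = omega / (1 - alpha - beta)  # start at the unconditional variance
eps_prev = 0.0
variances = []
for _ in range(n):
    # today's variance reacts to yesterday's squared shock (ARCH term)
    # and to yesterday's variance (GARCH term)
    sigma2 = omega + alpha * eps_prev ** 2 + beta * sigma2
    eps_prev = random.gauss(0.0, sigma2 ** 0.5)
    variances.append(sigma2)
```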

**generalized least-squares estimator.** A generalized least-squares (GLS) estimator is used to estimate the parameters of a regression function when the error term is heteroskedastic or autocorrelated. In the linear case, GLS is sometimes described as “OLS on transformed data” because the GLS estimator can be implemented by applying an appropriate transformation to the dataset and then using OLS.

**Granger causality.** The variable x is said to Granger-cause variable y if, given the past values of y, past values of x are useful for predicting y.

**high-pass filter.** Time-series filters are designed to pass or block stochastic cycles at specified frequencies. High-pass filters pass through stochastic cycles above the cutoff frequency and block all other stochastic cycles.

**Holt–Winters smoothing.** A set of methods for smoothing time-series data that assume that the value of a time series at time t can be approximated as the sum of a mean term that drifts over time, as well as a time trend whose strength also drifts over time. Variations of the basic method allow for seasonal patterns in data, as well.

**impulse–response function.** An impulse–response function (IRF) measures the effect of a shock to an endogenous variable on itself or another endogenous variable. The kth impulse–response function of variable i on variable j measures the effect on variable j in period t + k in response to a one-unit shock to variable i in period t, holding everything else constant.

**independent and identically distributed.** A series of observations is independent and identically distributed (i.i.d.) if each observation is an independent realization from the same underlying distribution. In some contexts, the definition is relaxed to mean only that the observations are independent and have identical means and variances.

**integrated process.** A nonstationary process is integrated of order d, written I(d), if the process must be differenced d times to produce a stationary series.

**Kalman filter.** The Kalman filter is a recursive procedure for predicting the state vector in a state-space model.

**lag operator.** The lag operator L denotes the value of a variable at time t−1.

**linear filter.** A linear filter is a sequence of weights used to compute a weighted average of a time series at each time period.

**long-memory process.** A long-memory process is a stationary process whose autocorrelations decay at a slower rate than a short-memory process. ARFIMA models are typically used to represent long-memory processes, and ARMA models are typically used to represent short-memory processes.

**moving-average (MA) process.** A moving-average (MA) process is a time-series process in which the current value of a variable is modeled as a weighted average of current and past realizations of a white-noise process and, optionally, a time-invariant constant. By convention, the weight on the current realization of the white-noise process is equal to one, and the weights on the past realizations are known as the MA coefficients.
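A minimal MA(1) sketch with assumed values μ = 2 and θ = 0.6 (my own); note the weight of one on the current shock, as the definition says:

```python
import random

random.seed(3)
mu, theta = 2.0, 0.6   # assumed constant and MA coefficient
u_prev = 0.0
y = []
for _ in range(250):
    u = random.gauss(0.0, 1.0)
    # current value = constant + current shock + theta * previous shock
    y.append(mu + u + theta * u_prev)
    u_prev = u
```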

**multivariate GARCH models.** Multivariate GARCH models are multivariate time-series models in which the conditional covariance matrix of the errors depends on its own past and its past shocks. The acute trade-off between parsimony and flexibility has given rise to a plethora of models.

**Newey–West covariance matrix.** The Newey–West covariance matrix is a member of the class of heteroskedasticity- and autocorrelation-consistent (HAC) covariance matrix estimators used with time-series data that produces covariance estimates that are robust to both arbitrary heteroskedasticity and autocorrelation up to a prespecified lag.

**orthogonalized impulse–response function.** An orthogonalized impulse–response function (OIRF) measures the effect of an orthogonalized shock to an endogenous variable on itself or another endogenous variable. An orthogonalized shock is one that affects one variable at time t but no other variables.

**output gap.** The output gap, sometimes called the GDP gap, is the difference between the actual output of an economy and its potential output.

**partial autocorrelation function.** The partial autocorrelation function (PACF) expresses the correlation between periods t and t−k of a time series as a function of the time t and lag k, after controlling for the effects of intervening lags. For a stationary time series, the PACF does not depend on t. The PACF is not symmetric about k = 0: the partial autocorrelation between y_t and y_{t−k} is not equal to the partial autocorrelation between y_t and y_{t+k}.

**periodogram.** A periodogram is a graph of the spectral density function of a time series as a function of frequency. Peaks in the periodogram represent cyclical behavior in the data.

**phase function.** The phase function of a linear filter specifies how the filter changes the relative importance of the random components at different frequencies in the frequency domain.

**Phillips curve.** The Phillips curve is a macroeconomic relationship between inflation and economic activity, usually expressed as an equation involving inflation and the output gap. Historically, the Phillips curve describes an inverse relationship between the unemployment rate and the rate of increase in wages.

**portmanteau statistic.** The portmanteau, or Q, statistic is used to test for white noise and is calculated using the first m autocorrelations of the series, where m is chosen by the user. Under the null hypothesis that the series is a white-noise process, the portmanteau statistic has a χ2 distribution with m degrees of freedom.
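The simplest version is the Box–Pierce form, Q = n Σ r_k² summed over lags 1..m (the Ljung–Box refinement weights each term by (n + 2)/(n − k); both are χ2 with m degrees of freedom under the null). A sketch on toy data of my own, where a trending series should produce a large Q:

```python
def sample_acf(x, k):
    """Sample autocorrelation at lag k."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    return sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / var

def box_pierce_q(x, m):
    """Box–Pierce portmanteau statistic over the first m autocorrelations."""
    n = len(x)
    return n * sum(sample_acf(x, k) ** 2 for k in range(1, m + 1))

# a strongly autocorrelated (trending) series: Q should be far from zero
trend = [float(t) for t in range(50)]
q = box_pierce_q(trend, m=4)
```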

**Prais–Winsten estimator.** A Prais–Winsten estimator is a linear regression estimator that is used when the error term exhibits first-order autocorrelation; see also Cochrane–Orcutt estimator. Here the first observation in the dataset is transformed so that it is not lost. The Prais–Winsten estimator is a generalized least-squares estimator.

**random walk.** A random walk is a time-series process in which the current period’s realization is equal to the previous period’s realization plus a white-noise error term. A random walk with drift also contains a nonzero time-invariant constant. The constant term δ is known as the drift parameter. An important property of random-walk processes is that the best predictor of the value at time t + 1 is the value at time t plus the value of the drift parameter.
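A random-walk-with-drift sketch (δ = 0.5 and unit-variance shocks are my own assumptions): the average first difference of the simulated path recovers the drift, and the best one-step-ahead predictor is the current value plus δ:

```python
import random

random.seed(11)
delta, sigma = 0.5, 1.0   # assumed drift and shock standard deviation
y = [0.0]
for _ in range(200):
    # current value = drift + previous value + white-noise error
    y.append(delta + y[-1] + random.gauss(0.0, sigma))

# best predictor of the next value: current value plus the drift
best_predictor_next = y[-1] + delta
# first differences average out to roughly the drift parameter
avg_diff = (y[-1] - y[0]) / 200
```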

**recursive regression analysis.** A recursive regression analysis involves performing a regression at time t by using all available observations from some starting time t0 through time t, performing another regression at time t + 1 by using all observations from time t0 through time t + 1, and so on. Unlike a rolling regression analysis, the first period used for all regressions is held fixed.

**regressand.** The regressand is the variable that is being explained or predicted in a regression model. Synonyms include dependent variable, left-hand-side variable, and endogenous variable.

**regressor.** Regressors are variables in a regression model used to predict the regressand. Synonyms include independent variable, right-hand-side variable, explanatory variable, predictor variable, and exogenous variable.

**rolling regression analysis.** A rolling, or moving window, regression analysis involves performing regressions for each period by using the most recent m periods’ data, where m is known as the window size.
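A rolling-window sketch on made-up data whose slope switches from +2 to −3 halfway through: with a window of m = 4, the rolling OLS slope tracks the change, whereas a full-sample regression would average it away.

```python
def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den

x = list(range(10))
# slope +2 for the first half, then slope -3 (toy regime change)
y = [2.0 * t for t in range(5)] + [28.0 - 3.0 * t for t in range(5, 10)]
m = 4   # window size
slopes = [ols_slope(x[i - m:i], y[i - m:i]) for i in range(m, len(x) + 1)]
```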

**seasonal difference operator.** The period-s seasonal difference operator ∆s denotes the difference between the value of a variable at time t and its value at time t−s.

**serial correlation.** Serial correlation refers to regression errors that are correlated over time. If a regression model does not contain lagged dependent variables as regressors, the OLS estimates are consistent in the presence of mild serial correlation, but the covariance matrix is incorrect. When the model includes lagged dependent variables and the residuals are serially correlated, the OLS estimates are biased and inconsistent.

**serial correlation tests.** Because OLS estimates are at least inefficient and potentially biased in the presence of serial correlation, econometricians have developed many tests to detect it. Popular ones include the Durbin–Watson (1950, 1951, 1971) test, the Breusch–Pagan (1980) test, and Durbin’s (1970) alternative test.

**smoothing.** Smoothing a time series refers to the process of extracting an overall trend in the data. The motivation behind smoothing is the belief that a time series exhibits a trend component as well as an irregular component and that the analyst is interested only in the trend component. Some smoothers also account for seasonal or other cyclical patterns.

**spectral density function.** The spectral density function is the derivative of the spectral distribution function. Intuitively, the spectral density function f(ω) indicates the amount of variance in a time series that is attributable to sinusoidal components with frequency ω. The spectral density function is sometimes called the spectrum.

**spectral distribution function.** The (normalized) spectral distribution function F(ω) of a process describes the proportion of variance that can be explained by sinusoids with frequencies in the range (0,ω), where 0≤ ω ≤ π. The spectral distribution and density functions used in frequency domain analysis are closely related to the autocorrelation function used in time-domain analysis.

**state-space model.** A state-space model describes the relationship between an observed time series and an unobservable state vector that represents the “state” of the world. The measurement equation expresses the observed series as a function of the state vector, and the transition equation describes how the unobserved state vector evolves over time. By defining the parameters of the measurement and transition equations appropriately, one can write a wide variety of time-series models in the state-space form.

**static forecast.** A static forecast uses actual values wherever lagged values of the endogenous variables appear in the model. As a result, static forecasts perform at least as well as dynamic forecasts, but static forecasts cannot produce forecasts into the future if lags of the endogenous variables appear in the model. Because actual values will be missing beyond the last historical time period in the dataset, static forecasts can forecast only one period into the future (assuming only first lags appear in the model); thus they are often called one-step-ahead forecasts.

**stationary process.** A process is stationary if the joint distribution of y1,…,yk is the same as the joint distribution of y1+τ,…,yk+τ for all k and τ. Intuitively, shifting the origin of the series by τ units has no effect on the joint distributions; the marginal distribution of the series does not change over time. A stationary process is also known as a strictly stationary process or a strongly stationary process. See also covariance stationary process.

**steady-state equilibrium.** The steady-state equilibrium is the predicted value of a variable in a dynamic model, ignoring the effects of past shocks, or, equivalently, the value of a variable, assuming that the effects of past shocks have fully died out and no longer affect the variable of interest.

**stochastic cycle.** A stochastic cycle is a cycle characterized by an amplitude, phase, or frequency that can be random functions of time. See cyclical component.

**stochastic equation.** A stochastic equation, in contrast to an identity, is an equation in a forecast model that includes a random component, most often in the form of an additive error term. Stochastic equations include parameters that must be estimated from historical data.

**stochastic trend.** A stochastic trend is a nonstationary random process. Unit-root processes and random coefficients on time are two common stochastic trends.

**structural model.** In time-series analysis, a structural model is one that describes the relationship among a set of variables, based on underlying theoretical considerations. Structural models may contain both endogenous and exogenous variables.

**SVAR.** A structural vector autoregressive (SVAR) model is a type of VAR in which short- or long-run constraints are placed on the resulting impulse–response functions. The constraints are usually motivated by economic theory and therefore allow causal interpretations of the IRFs to be made.

**time-domain analysis.** Time-domain analysis is analysis of data viewed as a sequence of observations observed over time. The autocorrelation function, linear regression, ARCH models, and ARIMA models are common tools used in time-domain analysis.

**trend.** The trend specifies the long-run behavior in a time series. The trend can be deterministic or stochastic. Many economic, biological, health, and social time series have long-run tendencies to increase or decrease. Before the 1980s, most time-series analysis specified the long-run tendencies as deterministic functions of time. Since the 1980s, the stochastic trends implied by unit-root processes have become a standard part of the toolkit.

**unit-root process.** A unit-root process is one that is integrated of order one, meaning that the process is nonstationary but that first-differencing the process produces a stationary series. The simplest example of a unit-root process is the random walk.

**unit-root tests.** Whether a process has a unit root has both important statistical and economic ramifications, so a variety of tests have been developed to test for them. Among the earliest tests proposed is the one by Dickey and Fuller (1979), though most researchers now use an improved variant called the augmented Dickey–Fuller test instead of the original version. Other common unit-root tests include the DF–GLS test of Elliott, Rothenberg, and Stock (1996) and the Phillips–Perron (1988) test. Variants of unit-root tests suitable for panel data have also been developed.

**VAR.** A vector autoregressive (VAR) model is a multivariate regression technique in which each dependent variable is regressed on lags of itself and on lags of all the other dependent variables in the model. Occasionally, exogenous variables are also included in the model.

**VECM.** A vector error-correction model (VECM) is a type of VAR that is used with variables that are cointegrated. Although first-differencing variables that are integrated of order one makes them stationary, fitting a VAR to such first-differenced variables results in misspecification error if the variables are cointegrated.

**white-noise process.** A variable u_t represents a white-noise process if the mean of u_t is zero, the variance of u_t is σ², and the covariance between u_t and u_s is zero for all s ≠ t.

**Yule–Walker equations.** The Yule–Walker equations are a set of difference equations that describe the relationship among the autocovariances and autocorrelations of an autoregressive moving-average (ARMA) process.

**Source:** Stata 16 *time-series reference manual*, StataCorp
