📊Actuarial Mathematics Unit 2 Review

2.4 Stationary processes and autocorrelation


Written by the Fiveable Content Team • Last updated August 2025

Stationary processes and autocorrelation are foundational concepts in time series analysis. They give actuaries the tools to model dependencies in data that evolve over time, which directly feeds into pricing, reserving, and risk management decisions.

A stationary process has statistical properties that don't shift as time passes, and autocorrelation measures how observations at different time points relate to each other. Together, these ideas underpin most of the time series models you'll encounter in actuarial work.

Definition of stationary processes

A stationary process is a stochastic process whose statistical properties remain constant over time. The mean, variance, and autocovariance structure don't depend on when you observe the process. This property is what makes stationary processes tractable for modeling: if the rules governing the data don't change, you can use past behavior to forecast the future.

Strict vs weak stationarity

There are two levels of stationarity, and the distinction matters.

Strict stationarity requires that the entire joint probability distribution is invariant under time shifts. Formally, a process $\{X_t\}$ is strictly stationary if $(X_{t_1}, \ldots, X_{t_n})$ has the same distribution as $(X_{t_1+h}, \ldots, X_{t_n+h})$ for all choices of $t_1, \ldots, t_n$ and any shift $h$. This is a very strong condition because it constrains every moment and every joint relationship, not just the first two.

Weak stationarity (also called second-order or covariance stationarity) is less demanding. It only requires:

  • Constant mean: $E[X_t] = \mu$ for all $t$
  • Time-invariant autocovariance: $Cov(X_t, X_{t+h}) = \gamma(h)$, depending only on the lag $h$, not on $t$

Strict stationarity implies weak stationarity (assuming finite second moments), but the reverse isn't generally true. The notable exception is Gaussian processes, where weak stationarity does imply strict stationarity, because a Gaussian distribution is fully determined by its first two moments.

In practice, weak stationarity is the version you'll work with most often, since it's both verifiable from data and sufficient for fitting ARMA-type models.

Properties of stationary processes

  • The mean is constant over time: $E[X_t] = \mu$ for all $t$
  • The variance is constant over time: $Var(X_t) = \gamma(0)$ for all $t$
  • The autocovariance depends only on the lag: $Cov(X_t, X_{t+h}) = \gamma(h)$
  • The autocorrelation depends only on the lag: $Corr(X_t, X_{t+h}) = \rho(h)$

These properties make stationary processes well-suited for modeling phenomena with stable long-term behavior. Think of daily log-returns on a stock index or short-term interest rate changes: the level fluctuates, but the overall statistical character stays roughly the same.
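These properties can be checked empirically. The sketch below (with illustrative parameters, assuming numpy) simulates a stationary AR(1) process and verifies that the mean and variance estimated on the first and second halves of the series agree:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(1): X_t = 0.5 * X_{t-1} + eps_t.
# (Illustrative choice; |phi| < 1 guarantees stationarity.)
n, phi = 20_000, 0.5
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Weak stationarity in action: mean and variance estimated over
# different time windows should be close to each other (and to the
# theoretical values 0 and 1 / (1 - phi^2)).
first, second = x[: n // 2], x[n // 2 :]
print(first.mean(), second.mean())  # both near 0
print(first.var(), second.var())    # both near 1.333
```

For a non-stationary series (say, a random walk), the same check would fail: the variance grows with time, so the two halves would disagree markedly.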

Autocorrelation in stationary processes

Autocorrelation measures the linear dependence between a time series and lagged versions of itself. For actuaries, this is how you quantify whether today's observation tells you something about tomorrow's (or next month's, or next year's).

Definition of autocorrelation

The autocorrelation at lag $h$ is defined as:

$$\rho(h) = Corr(X_t, X_{t+h}) = \frac{\gamma(h)}{\gamma(0)}$$

Here $\gamma(h)$ is the autocovariance at lag $h$ and $\gamma(0)$ is the variance. Dividing by the variance normalizes the measure to fall between $-1$ and $1$, making it comparable across different series regardless of scale.

Autocorrelation function (ACF)

The autocorrelation function (ACF) plots $\rho(h)$ against the lag $h$. It gives you a visual snapshot of the entire dependence structure of a stationary series. You can use it to:

  • Identify how quickly the influence of a shock decays
  • Spot seasonal or cyclical patterns (spikes at regular lag intervals)
  • Guide model selection (e.g., the ACF of an MA($q$) process cuts off after lag $q$)

Properties of ACF

  • Symmetry: $\rho(h) = \rho(-h)$
  • Normalized at zero: $\rho(0) = 1$ (a series is perfectly correlated with itself)
  • Bounded: $-1 \leq \rho(h) \leq 1$
  • White noise signature: For a white noise process, $\rho(h) = 0$ for all $h \neq 0$

Sample ACF vs population ACF

The population ACF is a theoretical quantity. In practice, you estimate it from data using the sample ACF:

$$\hat{\rho}(h) = \frac{\sum_{t=1}^{n-h} (X_t - \bar{X})(X_{t+h} - \bar{X})}{\sum_{t=1}^{n} (X_t - \bar{X})^2}$$

where $\bar{X}$ is the sample mean and $n$ is the number of observations. This estimator is consistent, meaning it converges to the true $\rho(h)$ as $n \to \infty$. However, for small samples or large lags (where $h$ is close to $n$), the estimate becomes unreliable because fewer pairs of observations contribute to the sum.
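The estimator translates directly into code. This is a minimal numpy sketch (the function name `sample_acf` is ours, not a library call), checked against white noise, where every lag beyond 0 should be near zero:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample ACF rho_hat(h) for h = 0, ..., max_lag.

    The denominator sums over all n observations, matching the
    estimator in the text."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()            # center by the sample mean
    denom = np.sum(xc**2)        # sum over all n terms
    return np.array([np.sum(xc[: n - h] * xc[h:]) / denom
                     for h in range(max_lag + 1)])

# Sanity check on white noise: rho_hat(0) = 1, other lags near 0.
rng = np.random.default_rng(1)
acf = sample_acf(rng.normal(size=5000), max_lag=5)
print(acf)
```

Note that `acf[0]` is exactly 1 by construction, since numerator and denominator coincide at lag 0.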

Autocovariance in stationary processes

Autocovariance captures the same dependence information as autocorrelation but on the original scale of the data (it's not normalized). It plays a central role in the theoretical development of time series models.

Definition of autocovariance

The autocovariance at lag $h$ is:

$$\gamma(h) = Cov(X_t, X_{t+h}) = E[(X_t - \mu)(X_{t+h} - \mu)]$$

where $\mu$ is the process mean. Unlike autocorrelation, autocovariance retains the units of the original data (squared), so its magnitude depends on the scale of the series.

Autocovariance function (ACVF)

The autocovariance function (ACVF) plots $\gamma(h)$ against the lag $h$. It characterizes the second-order properties of a stationary process and tells you both the direction and magnitude of dependence at each lag.

Properties of ACVF

  • Symmetry: $\gamma(h) = \gamma(-h)$
  • Variance at lag 0: $\gamma(0) = Var(X_t)$
  • Bounded by variance: $|\gamma(h)| \leq \gamma(0)$ for all $h$
  • White noise: $\gamma(h) = 0$ for all $h \neq 0$

Relationship between ACF and ACVF

The ACF is simply the ACVF normalized by the variance:

$$\rho(h) = \frac{\gamma(h)}{\gamma(0)}$$

And conversely:

$$\gamma(h) = \rho(h) \cdot \gamma(0)$$

Use the ACF when you want a scale-free measure for comparing dependence across different series. Use the ACVF when you need the actual covariance values, for instance when writing out the variance of a forecast.

Estimation of ACF and ACVF

Estimating these functions from observed data is a critical step in time series analysis. The estimates guide model selection and parameter estimation.

Estimating ACF from data

The sample ACF at lag $h$ is:

$$\hat{\rho}(h) = \frac{\sum_{t=1}^{n-h} (X_t - \bar{X})(X_{t+h} - \bar{X})}{\sum_{t=1}^{n} (X_t - \bar{X})^2}$$

Note that the denominator sums over all $n$ observations (not $n - h$). This ensures the sample ACF matrix remains positive semi-definite, which is a necessary property for valid covariance structures.

Estimating ACVF from data

The sample ACVF at lag $h$ is:

$$\hat{\gamma}(h) = \frac{1}{n} \sum_{t=1}^{n-h} (X_t - \bar{X})(X_{t+h} - \bar{X})$$

Again, dividing by $n$ (rather than $n - h$) preserves positive semi-definiteness. Both the sample ACF and ACVF are consistent estimators of their population counterparts.
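The positive semi-definiteness claim can be verified numerically. The sketch below (function name `sample_acvf` is ours) estimates the ACVF with the $1/n$ convention, builds the implied Toeplitz covariance matrix, and checks that no eigenvalue is negative:

```python
import numpy as np

def sample_acvf(x, max_lag):
    """Sample ACVF gamma_hat(h), dividing by n (not n - h)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    return np.array([np.sum(xc[: n - h] * xc[h:]) / n
                     for h in range(max_lag + 1)])

# The 1/n convention guarantees the Toeplitz matrix built from
# gamma_hat is positive semi-definite, as any valid covariance
# matrix must be.
rng = np.random.default_rng(2)
g = sample_acvf(rng.normal(size=500), max_lag=10)
toeplitz = np.array([[g[abs(i - j)] for j in range(11)]
                     for i in range(11)])
eigenvalues = np.linalg.eigvalsh(toeplitz)
print(eigenvalues.min())  # non-negative up to rounding error
```

Had we divided by $n - h$ instead, this eigenvalue check could fail, which is exactly why the biased-looking $1/n$ estimator is the standard choice.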

Confidence intervals for ACF and ACVF

Under the null hypothesis that the true process is white noise, the sample ACF at each lag is approximately normally distributed with mean 0 and variance $\frac{1}{n}$. This gives the familiar confidence bands at $\pm \frac{1.96}{\sqrt{n}}$ (for a 95% level) that you see on ACF plots.

For a general stationary process (not white noise), the variance of $\hat{\rho}(h)$ is more complex. Bartlett's formula gives:

$$Var(\hat{\rho}(h)) \approx \frac{1}{n} \sum_{k=-\infty}^{\infty} \left[\rho(k)^2 + \rho(k+h)\rho(k-h) - 4\rho(h)\rho(k)\rho(k-h) + 2\rho(h)^2\rho(k)^2\right]$$

This is harder to use in practice because it depends on the unknown population ACF, but it's important to know that the simple $\pm 1.96/\sqrt{n}$ bands are only strictly valid under the white noise assumption.
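The white-noise bands are easy to check by simulation. This sketch computes the sample ACF of a simulated white noise series and counts how many lags fall inside $\pm 1.96/\sqrt{n}$ (roughly 95% should):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
x = rng.normal(size=n)  # white noise, so the null holds by construction

# Sample ACF at lags 1..20.
xc = x - x.mean()
denom = np.sum(xc**2)
acf = np.array([np.sum(xc[: n - h] * xc[h:]) / denom
                for h in range(1, 21)])

# Under the white-noise null, about 95% of sample autocorrelations
# should fall inside +/- 1.96 / sqrt(n).
band = 1.96 / np.sqrt(n)
inside = np.mean(np.abs(acf) < band)
print(band, inside)
```

A lag or two outside the band is expected by chance; many lags outside it is evidence against white noise.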

Models for stationary processes

Several standard models capture different types of dependence in stationary time series. Choosing the right one depends on the ACF and PACF patterns you observe in the data.

Autoregressive (AR) models

An AR($p$) model expresses the current value as a linear combination of its $p$ most recent values plus a white noise shock:

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \epsilon_t$$

where $\epsilon_t \sim WN(0, \sigma^2)$. The process is stationary if all roots of the characteristic polynomial $1 - \phi_1 z - \cdots - \phi_p z^p = 0$ lie outside the unit circle.

The ACF of an AR process decays gradually (exponentially or with damped oscillations), while the partial autocorrelation function (PACF) cuts off after lag $p$. This PACF cutoff is how you identify the order of an AR model from data.
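The exponential decay is concrete for an AR(1): its theoretical ACF is $\rho(h) = \phi^h$. The sketch below (illustrative parameters) simulates a long AR(1) series and compares the sample ACF to that geometric decay:

```python
import numpy as np

# For an AR(1) with coefficient phi, the theoretical ACF is phi**h.
rng = np.random.default_rng(4)
n, phi = 50_000, 0.7
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Sample ACF at lags 0..5, which should track 0.7**h.
xc = x - x.mean()
denom = np.sum(xc**2)
acf = np.array([np.sum(xc[: n - h] * xc[h:]) / denom
                for h in range(6)])
print(acf)  # close to [1, 0.7, 0.49, 0.343, 0.2401, 0.16807]
```

The geometric pattern (each lag roughly $\phi$ times the previous one) is the visual signature of AR(1) behavior on an ACF plot.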

Moving average (MA) models

An MA($q$) model expresses the current value as a linear combination of the current and $q$ most recent white noise terms:

$$X_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$$

MA processes are always stationary (regardless of parameter values). Their ACF cuts off sharply after lag $q$, which is the key identification tool. The PACF, by contrast, decays gradually.

MA models represent processes with "finite memory": a shock at time $t$ affects the series only through time $t + q$ and then disappears.
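The sharp cutoff is easy to see in simulation. For an MA(1), the theoretical ACF is $\rho(1) = \theta/(1+\theta^2)$ and $\rho(h) = 0$ for $h \geq 2$; this sketch (illustrative parameters) checks both claims:

```python
import numpy as np

# MA(1): X_t = eps_t + theta * eps_{t-1}.
rng = np.random.default_rng(5)
n, theta = 50_000, 0.8
eps = rng.normal(size=n + 1)
x = eps[1:] + theta * eps[:-1]

# Sample ACF at lags 1..4: a spike at lag 1, then near zero.
xc = x - x.mean()
denom = np.sum(xc**2)
acf = np.array([np.sum(xc[: n - h] * xc[h:]) / denom
                for h in range(1, 5)])
print(acf)  # approximately [0.488, 0, 0, 0]
```

Here $\theta/(1+\theta^2) = 0.8/1.64 \approx 0.488$, and lags 2 and beyond vanish because $X_t$ and $X_{t+h}$ share no $\epsilon$ terms once $h > 1$.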

Autoregressive moving average (ARMA) models

An ARMA($p,q$) model combines both components:

$$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$$

ARMA models are flexible enough to represent a wide range of stationary processes, often with fewer parameters than a pure AR or MA model would require. Both the ACF and PACF of an ARMA process decay gradually, which is what distinguishes it from pure AR or pure MA behavior.

Model selection typically involves examining the sample ACF and PACF, then using information criteria (AIC, BIC) to compare candidate models.
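As a hedged sketch of that workflow (not a full model-selection procedure), the code below fits an AR(1) by the Yule-Walker approach ($\hat{\phi} = \hat{\rho}(1)$) and compares it to a white-noise model using the common Gaussian approximation $AIC \approx n \ln \hat{\sigma}^2 + 2k$:

```python
import numpy as np

# Simulate AR(1) data (illustrative parameters).
rng = np.random.default_rng(6)
n, phi = 5_000, 0.6
eps = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

xc = x - x.mean()
gamma0 = np.sum(xc**2) / n                       # sample variance
rho1 = np.sum(xc[:-1] * xc[1:]) / np.sum(xc**2)  # sample ACF at lag 1

# AR(0) (white noise, 1 parameter) vs AR(1) fitted by Yule-Walker
# (phi_hat = rho_hat(1), 2 parameters). The AR(1) innovation variance
# is gamma(0) * (1 - phi_hat^2).
aic_ar0 = n * np.log(gamma0) + 2 * 1
aic_ar1 = n * np.log(gamma0 * (1 - rho1**2)) + 2 * 2
print(aic_ar1 < aic_ar0)  # AR(1) wins on genuinely autocorrelated data
```

In practice you would compare many candidate $(p, q)$ orders with a proper maximum-likelihood fit, but the principle is the same: the lower AIC wins, with the $2k$ term penalizing extra parameters.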

Stationarity tests

Before fitting a stationary model, you need to verify that your data actually is stationary. There are both informal and formal approaches.

Visual inspection of time series

Start by plotting the series. A stationary series should show:

  • No visible trend (the mean doesn't drift up or down)
  • Roughly constant spread over time (no expanding or contracting variance)
  • No obvious structural breaks

This is a quick sanity check, not a definitive test. Visual inspection can miss subtle non-stationarity, and it can also be misleading with short series.

Augmented Dickey-Fuller (ADF) test

The ADF test is the most commonly used formal test for a unit root.

  • Null hypothesis: The series has a unit root (non-stationary)
  • Alternative hypothesis: The series is stationary

The test regression augments the basic Dickey-Fuller equation with lagged difference terms to account for serial correlation. You reject the null (conclude stationarity) if the test statistic is sufficiently negative, compared to the ADF critical values (which are non-standard and more negative than normal distribution critical values).
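To make the mechanics concrete, here is a simplified sketch of the non-augmented Dickey-Fuller regression (the function name `df_statistic` is ours; the real ADF test adds lagged differences and usually a constant or trend). It regresses $\Delta X_t$ on $X_{t-1}$ and returns the t-statistic on that coefficient:

```python
import numpy as np

def df_statistic(x):
    """Basic (non-augmented, no-constant) Dickey-Fuller t-statistic.

    Regress Delta X_t on X_{t-1}; a strongly negative t-statistic on
    the slope is evidence against a unit root."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    lag = x[:-1]
    gamma_hat = np.sum(lag * dx) / np.sum(lag**2)   # OLS slope
    resid = dx - gamma_hat * lag
    s2 = np.sum(resid**2) / (len(dx) - 1)           # residual variance
    se = np.sqrt(s2 / np.sum(lag**2))               # slope standard error
    return gamma_hat / se

rng = np.random.default_rng(7)
eps1 = rng.normal(size=2000)
eps2 = rng.normal(size=2000)
stationary = np.zeros(2000)
for t in range(1, 2000):
    stationary[t] = 0.5 * stationary[t - 1] + eps1[t]
random_walk = np.cumsum(eps2)  # unit-root process

print(df_statistic(stationary))   # strongly negative
print(df_statistic(random_walk))  # modest, consistent with a unit root
```

Remember that the statistic is compared to the non-standard Dickey-Fuller critical values, not normal-distribution ones; in real work you would use a tested implementation such as `statsmodels.tsa.stattools.adfuller` rather than this sketch.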

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test

The KPSS test flips the hypotheses:

  • Null hypothesis: The series is stationary
  • Alternative hypothesis: The series has a unit root (non-stationary)

This reversal is useful because it lets you test stationarity from the opposite direction. A common strategy is to run both the ADF and KPSS tests together. If ADF rejects its null and KPSS fails to reject its null, you have strong evidence of stationarity. If they give conflicting results, further investigation (differencing, transformation) is needed.

Applications of stationary processes

Time series forecasting

Stationary models like AR, MA, and ARMA form the backbone of time series forecasting. Once you've confirmed stationarity and fitted an appropriate model, you can generate point forecasts and prediction intervals for future values. For actuaries, this applies directly to projecting claim frequencies, loss ratios, and economic variables used in reserve calculations.

Signal processing

Spectral analysis and filtering techniques assume stationarity to decompose a time series into frequency components. While this originates in engineering, actuaries encounter these methods when analyzing high-frequency financial data or extracting cyclical patterns from economic indicators.

Quality control

Control charts (Shewhart, CUSUM, EWMA) monitor a process over time and flag deviations from expected behavior. These charts assume the underlying process is stationary, so that deviations represent genuine changes rather than natural drift. Actuaries working in product reliability, warranty analysis, or operational risk may use these tools to detect anomalies.