Regression with Time Series Data
Standard regression assumes that each observation is independent of the others. Time series data breaks that assumption because observations are ordered in time, and nearby values tend to be correlated. This creates problems you won't encounter with cross-sectional data, so time series regression requires its own set of tools and checks.
Challenges in Time Series Regression
Autocorrelation is the core issue. When today's value is correlated with yesterday's value (and the day before that), your observations aren't independent. If you ignore autocorrelation and run ordinary regression anyway, your coefficient estimates may still be unbiased, but the standard errors will be wrong. That means your confidence intervals and hypothesis tests can't be trusted.
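To see that dependence concretely, here is a minimal numpy sketch (synthetic data; the AR coefficient of 0.8 is chosen only for illustration) that simulates an autocorrelated series and measures the correlation between consecutive values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series: y_t = 0.8 * y_{t-1} + noise
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.standard_normal()

# Lag-1 sample autocorrelation: correlation between y_t and y_{t-1}
r1 = np.corrcoef(y[1:], y[:-1])[0, 1]
print(round(r1, 2))  # close to 0.8 -- observations are clearly not independent
```

A value near 0.8 rather than 0 is exactly the violation of independence that invalidates ordinary standard errors.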
Non-stationarity is the other major challenge. A stationary series has a constant mean, variance, and autocovariance over time. Many real-world series aren't stationary: stock prices trend upward, temperatures follow seasonal cycles, and so on. Regressing one non-stationary series on another can produce spurious regression, where you find a strong statistical relationship between variables that have no real connection. The classic example is two unrelated series that both happen to trend upward over time.
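The spurious-regression effect is easy to reproduce. The sketch below (purely synthetic data) generates pairs of completely unrelated random walks and compares their average absolute correlation in levels against the same correlation after first differencing:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unrelated random walks often look strongly correlated in levels,
# but the apparent relationship vanishes after differencing.
n, reps = 300, 200
level_corr, diff_corr = [], []
for _ in range(reps):
    x = np.cumsum(rng.standard_normal(n))   # random walk 1
    y = np.cumsum(rng.standard_normal(n))   # random walk 2, independent of x
    level_corr.append(abs(np.corrcoef(x, y)[0, 1]))
    diff_corr.append(abs(np.corrcoef(np.diff(x), np.diff(y))[0, 1]))

print(round(np.mean(level_corr), 2), round(np.mean(diff_corr), 2))
```

The levels show a substantial average correlation even though the series share nothing; the differenced series show almost none.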
Seasonality and trends also need explicit treatment. If your data has a recurring monthly pattern or a long-run upward drift and you don't account for it, those patterns get absorbed into your error term. The result is biased coefficients and poor predictions.

Components of Time Series Models
A time series regression model typically accounts for three types of structure:
- Trend component: Captures the long-term direction of the series. A simple approach is to include a time index t as a predictor (linear trend). For curved trends, you might add a squared time term t^2 (quadratic) or apply a log transformation. The choice depends on what the data actually looks like when you plot it.
- Seasonality component: Represents repeating patterns at fixed intervals (monthly, quarterly, etc.). You can model this with dummy variables (one for each season minus one) or with Fourier terms (pairs of sine and cosine functions). Fourier terms are especially useful when the seasonal pattern is smooth rather than sharply different across periods.
- Exogenous variables: External factors that influence your series but aren't part of its own past behavior. Examples include GDP growth affecting retail sales, a policy change shifting energy consumption, or a marketing campaign boosting website traffic. These can be time-varying or constant (like a binary indicator for "before vs. after" a one-time event).
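The three components can be combined in a single OLS fit. Here is a hedged sketch in plain numpy; the coefficients, the sinusoidal seasonal shape, and the "exogenous driver" are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical monthly series: linear trend + seasonal pattern + exogenous driver
n, m = 120, 12                       # ten years of monthly observations
t = np.arange(n)
season = np.tile(np.arange(m), n // m)
exog = rng.standard_normal(n)        # stand-in for an external factor
y = 0.5 * t + 3.0 * np.sin(2 * np.pi * season / m) + 2.0 * exog \
    + rng.standard_normal(n)

# Design matrix: intercept, time index, 11 seasonal dummies (one season
# dropped as the baseline), and the exogenous variable
dummies = (season[:, None] == np.arange(1, m)).astype(float)
X = np.column_stack([np.ones(n), t, dummies, exog])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(round(beta[1], 2))   # trend coefficient, close to the true 0.5
print(round(beta[-1], 2))  # exogenous coefficient, close to the true 2.0
```

The dummy columns absorb any fixed per-season level, which is why they recover even a smooth sinusoidal seasonal pattern here.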

Techniques for Non-Stationarity and Autocorrelation
Differencing removes non-stationarity in the mean. First-order differencing creates a new series: y'_t = y_t - y_{t-1}.
Each value becomes the change from the previous period rather than the level. If the series still isn't stationary after first differencing, you can difference again (second-order differencing). For seasonal non-stationarity, you'd use a seasonal difference: y'_t = y_t - y_{t-m}, where m is the seasonal period (e.g., 12 for monthly data).
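Both operations are one-liners in numpy. In this toy sketch the series and the seasonal period of 3 are invented purely to keep the arithmetic easy to follow (for real monthly data you'd use m = 12):

```python
import numpy as np

# Toy series for illustration
y = np.array([10.0, 12.0, 15.0, 14.0, 18.0, 21.0])

# First-order difference: y_t - y_{t-1}
d1 = np.diff(y)          # differences: 2, 3, -1, 4, 3
print(d1)

# Seasonal difference with period m: y_t - y_{t-m}
m = 3                    # short period just for the toy example
ds = y[m:] - y[:-m]      # differences: 4, 6, 6
print(ds)
```

Note that each difference shortens the series: by 1 observation for a first difference, by m for a seasonal difference.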
Detrending is an alternative to differencing when the non-stationarity comes from a deterministic trend. You estimate the trend (say, by regressing y_t on t), then subtract it from the original series. The residuals from that regression become your detrended series, which you can then model as stationary.
Differencing vs. detrending: Use differencing when the trend is stochastic (random-walk-like behavior). Use detrending when the trend is deterministic (a predictable function of time). Choosing the wrong one can distort your results.
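Detrending against a deterministic trend looks like this in numpy (synthetic data; the intercept 5.0 and slope 0.3 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Series with a deterministic linear trend plus noise
n = 200
t = np.arange(n)
y = 5.0 + 0.3 * t + rng.standard_normal(n)

# Estimate the trend y_t = a + b*t, then subtract the fitted values
b, a = np.polyfit(t, y, 1)           # polyfit returns slope first
detrended = y - (a + b * t)          # residuals = detrended series

print(round(b, 2))                   # estimated slope, close to 0.3
print(round(detrended.mean(), 2))    # OLS residuals average to 0
```

The detrended residuals have no remaining trend and can be modeled as a stationary series.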
Lagged dependent variables address autocorrelation directly. By including past values of y as predictors (e.g., y_{t-1}, y_{t-2}), you let the model capture the temporal dependence. These are called autoregressive terms. The number of lags to include is a modeling decision, often guided by the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots.
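A lagged dependent variable is just another regressor once you align the series. This sketch (simulated AR(1) data with an illustrative coefficient of 0.7) fits one autoregressive term by OLS:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate an AR(1) process: y_t = 0.7 * y_{t-1} + noise
n = 1000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.standard_normal()

# Regress y_t on a constant and y_{t-1} (one autoregressive term)
X = np.column_stack([np.ones(n - 1), y[:-1]])
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
print(round(beta[1], 2))  # close to the true AR coefficient 0.7
```

Each lag you add costs one observation at the start of the sample, since the earliest rows have no lagged value to use.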
Model Evaluation and Prediction
Evaluation of Time Series Models
Residual analysis is your first check after fitting a model. For the model to be adequate, the residuals should behave like white noise:
- No autocorrelation (check with the ACF plot of residuals)
- Constant variance over time (homoscedasticity)
- Approximately normal distribution
The Durbin-Watson test specifically tests for first-order autocorrelation in residuals. The test statistic ranges from 0 to 4, with values near 2 indicating no autocorrelation. Values well below 2 suggest positive autocorrelation; values well above 2 suggest negative autocorrelation. If your residuals are autocorrelated, the model is missing some structure in the data.
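The statistic itself is simple to compute: the sum of squared successive residual differences divided by the residual sum of squares. A minimal sketch on simulated residuals (white noise versus an illustrative AR(1) with coefficient 0.8):

```python
import numpy as np

rng = np.random.default_rng(4)

# Durbin-Watson statistic: sum of squared successive differences of the
# residuals, divided by the residual sum of squares
def durbin_watson(resid):
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# White-noise residuals -> statistic near 2
white = rng.standard_normal(500)
print(round(durbin_watson(white), 1))

# Positively autocorrelated residuals -> statistic well below 2
ar = np.zeros(500)
for t in range(1, 500):
    ar[t] = 0.8 * ar[t - 1] + rng.standard_normal()
print(round(durbin_watson(ar), 1))
```

A handy approximation: the statistic is roughly 2(1 - r), where r is the lag-1 autocorrelation of the residuals, which is why 2 corresponds to no autocorrelation.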
Information criteria help you choose between competing models. Both AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) balance goodness-of-fit against model complexity. Lower values are better. BIC penalizes extra parameters more heavily than AIC, so it tends to favor simpler models.
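For a Gaussian regression, both criteria can be computed directly from the residual sum of squares. The sketch below (synthetic data, illustrative trend coefficients) compares a correct model against one padded with an irrelevant predictor, and shows the penalty gap between BIC and AIC:

```python
import numpy as np

rng = np.random.default_rng(5)

# AIC = n*log(RSS/n) + 2k,  BIC = n*log(RSS/n) + k*log(n),
# where k is the number of estimated parameters (Gaussian likelihood form)
def aic_bic(resid, k):
    n = len(resid)
    fit = n * np.log(np.sum(resid ** 2) / n)
    return fit + 2 * k, fit + k * np.log(n)

n = 200
t = np.arange(n)
y = 1.0 + 0.5 * t + rng.standard_normal(n)

# Model 1: intercept + trend (matches how the data was generated)
X1 = np.column_stack([np.ones(n), t])
beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)
aic1, bic1 = aic_bic(y - X1 @ beta1, 2)

# Model 2: same, plus an irrelevant noise predictor
X2 = np.column_stack([X1, rng.standard_normal(n)])
beta2, *_ = np.linalg.lstsq(X2, y, rcond=None)
aic2, bic2 = aic_bic(y - X2 @ beta2, 3)

# BIC charges log(n) per extra parameter vs AIC's flat 2, so for n = 200
# the extra predictor costs log(200) - 2 (about 3.3) more under BIC
print(round((bic2 - bic1) - (aic2 - aic1), 2))
```

Because log(n) exceeds 2 whenever n is at least 8, BIC is harder on the larger model for any realistic sample size.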
Out-of-sample forecasting tests whether your model actually predicts well on data it hasn't seen:
- Split your data into a training set (earlier observations) and a test set (later observations). Don't shuffle randomly; the time order matters.
- Fit the model on the training set only.
- Generate forecasts for the test set period.
- Compare forecasts to actual values using accuracy metrics like RMSE (root mean squared error), MAE (mean absolute error), or MAPE (mean absolute percentage error).
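The steps above fit in a few lines of numpy. Everything here is a synthetic illustration (the trend model, the coefficients, and the 100/20 split are all invented), but the structure of the workflow is the point:

```python
import numpy as np

rng = np.random.default_rng(6)

# Chronological split: earlier observations train, later ones test
n = 120
t = np.arange(n)
y = 2.0 + 0.4 * t + rng.standard_normal(n)
X = np.column_stack([np.ones(n), t])

split = 100                            # no shuffling -- time order matters
beta, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
forecast = X[split:] @ beta            # forecasts for the held-out period

err = y[split:] - forecast
rmse = np.sqrt(np.mean(err ** 2))              # root mean squared error
mae = np.mean(np.abs(err))                     # mean absolute error
mape = np.mean(np.abs(err / y[split:])) * 100  # mean absolute % error
print(round(rmse, 2), round(mae, 2), round(mape, 2))
```

MAE is never larger than RMSE, and the gap between them widens when a few forecasts are badly wrong, which makes the pair more informative than either metric alone.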
Rolling window cross-validation takes this further. Instead of a single train/test split, you repeatedly move the training window forward in time, forecast the next period, and record the error. This gives you a more robust picture of how the model performs across different time periods, not just one arbitrary split point.
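A rolling evaluation is a loop around the single-split procedure. This sketch (again with invented synthetic data and an illustrative window length of 100) slides a fixed-length training window forward and records one-step-ahead errors:

```python
import numpy as np

rng = np.random.default_rng(7)

# Rolling-window evaluation: refit on the most recent `window` points,
# forecast one step ahead, record the error, then slide forward
n, window = 150, 100
t = np.arange(n)
y = 1.0 + 0.2 * t + rng.standard_normal(n)
X = np.column_stack([np.ones(n), t])

errors = []
for end in range(window, n):
    beta, *_ = np.linalg.lstsq(X[end - window:end], y[end - window:end],
                               rcond=None)
    errors.append(y[end] - X[end] @ beta)   # one-step-ahead forecast error

rmse = np.sqrt(np.mean(np.square(errors)))
print(round(rmse, 2))
```

A common variant grows the training set instead of sliding it (an expanding window); the fixed window is preferable when older data may no longer reflect the current process.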