The Durbin-Watson test detects autocorrelation in regression residuals, specifically whether errors in a time series regression are correlated with their own lagged values. When autocorrelation is present, your OLS standard errors become unreliable, which means hypothesis tests and confidence intervals can't be trusted. This section covers how the test works, how to calculate and interpret the statistic, and what to do when autocorrelation shows up.
Overview of Durbin-Watson test
The Durbin-Watson test checks whether the residuals from a regression are autocorrelated. In a well-specified OLS model, residuals should be independent of one another. If they're not, you've violated a key Gauss-Markov assumption, and your coefficient estimates, while still unbiased, will have incorrect standard errors.
The test was developed by James Durbin and Geoffrey Watson in papers published in 1950 and 1951, and it remains one of the most common diagnostic checks in time series econometrics.
Purpose of the test
Autocorrelation means the error term in one period is correlated with the error term in a previous period. Think of it this way: if your model consistently underpredicts for several periods in a row, then overpredicts for several periods, the residuals are following a pattern rather than bouncing randomly. That pattern is autocorrelation.
Why does this matter? OLS assumes errors are independent. When that assumption fails:
- Standard errors are typically underestimated, making t-statistics too large
- You'll reject null hypotheses too often (false positives)
- Coefficient estimates remain unbiased but are no longer efficient (no longer the best you can get)
Assumptions behind the test
The Durbin-Watson test is valid only under specific conditions:
- The regression model includes an intercept term
- The explanatory variables are non-stochastic (fixed in repeated sampling)
- The errors follow a first-order autoregressive process (AR(1)), meaning u_t = ρ u_{t−1} + ε_t, where ε_t is white noise
- The model does not include a lagged dependent variable (e.g., y_{t−1}) as a regressor
If any of these conditions are violated, the test results may be misleading.
Test for autocorrelation
The Durbin-Watson test is built to detect first-order autocorrelation, which is the correlation between consecutive residuals. It can identify both positive and negative autocorrelation.
Positive vs negative autocorrelation
Positive autocorrelation means a positive residual in one period tends to be followed by another positive residual, and negative residuals tend to follow negative ones. If you plot the residuals over time, you'll see smooth, wave-like patterns. This is the more common type in economic time series data.
Negative autocorrelation means residuals tend to alternate in sign: a positive residual is likely followed by a negative one, and vice versa. The residual plot looks like a rapid zigzag pattern.
First-order autocorrelation
First-order autocorrelation is the correlation between e_t and e_{t−1}. The autoregressive parameter ρ captures the strength and direction of this relationship:
- ρ > 0: positive autocorrelation
- ρ < 0: negative autocorrelation
- ρ = 0: no autocorrelation
The Durbin-Watson statistic is directly related to ρ. Approximately, d ≈ 2(1 − ρ̂), which is why the statistic centers around 2 when there's no autocorrelation.
Higher-order autocorrelation
The Durbin-Watson test cannot detect higher-order autocorrelation, such as the correlation between e_t and e_{t−2} or e_{t−4}. If you suspect autocorrelation at longer lags (common with quarterly or monthly data where seasonal patterns exist), use the Breusch-Godfrey LM test instead. The Breusch-Godfrey test is more flexible and also works when lagged dependent variables are in the model.
Calculating Durbin-Watson statistic
Formula for test statistic
The Durbin-Watson statistic is computed from the OLS residuals:

d = Σ_{t=2}^{n} (e_t − e_{t−1})² / Σ_{t=1}^{n} e_t²

where e_t is the residual at time t and n is the number of observations.
The numerator sums the squared differences between each residual and the one before it. The denominator is just the residual sum of squares. If consecutive residuals are similar to each other (positive autocorrelation), the numerator will be small relative to the denominator, pushing d toward 0.
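A minimal sketch of the calculation on simulated AR(1) residuals (the sample size, seed, and ρ value are illustrative), checked against the `durbin_watson` helper in statsmodels:

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(42)

# Simulate AR(1) residuals with rho = 0.7 (positive autocorrelation)
n, rho = 500, 0.7
e = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + eps[t]

# d = sum of squared first differences / residual sum of squares
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(round(d, 3))                 # close to 2 * (1 - 0.7) = 0.6
print(round(durbin_watson(e), 3))  # statsmodels gives the same value
```

Note that both values land well below 2, as the approximation d ≈ 2(1 − ρ) predicts for ρ = 0.7.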

Range of possible values
The statistic always falls between 0 and 4:
- d ≈ 2: No autocorrelation (residuals are independent)
- d close to 0: Strong positive autocorrelation (consecutive residuals move together)
- d close to 4: Strong negative autocorrelation (consecutive residuals alternate in sign)
A quick rule of thumb: if d is between roughly 1.5 and 2.5, autocorrelation is probably not severe. But you should always check against the formal critical values.
Interpreting the test statistic
You can't just compare d to a single critical value. The Durbin-Watson distribution depends on the specific design matrix X, so exact critical values vary. Instead, Durbin and Watson established lower (d_L) and upper (d_U) bounds that apply regardless of the data configuration. This creates zones of rejection, non-rejection, and inconclusiveness.
Critical values for the test
Lower and upper bounds
The critical values d_L and d_U are found in published Durbin-Watson tables. They depend on three things:
- The significance level (α)
- The number of observations (n)
- The number of regressors (k), excluding the intercept
For testing positive autocorrelation:
- If d < d_L: reject H₀ (evidence of positive autocorrelation)
- If d > d_U: do not reject H₀
- If d_L ≤ d ≤ d_U: the test is inconclusive
The inconclusive region is a real drawback of this test. With small samples or many regressors, this zone can be quite wide.
Significance level
The significance level α is typically set at 0.05 (5%) or 0.01 (1%). A smaller α makes the test more conservative, meaning you need stronger evidence to reject the null. Most Durbin-Watson tables provide bounds for both levels.
Number of regressors
As k increases, the gap between d_L and d_U widens, making the inconclusive region larger. With many regressors and a small sample, the test becomes less useful because you're more likely to land in the inconclusive zone.
Testing procedure
Here's how to carry out the Durbin-Watson test step by step:
Null and alternative hypotheses
- H₀: ρ = 0 (no first-order autocorrelation)
- H₁: ρ > 0 for a one-sided test of positive autocorrelation
- H₁: ρ < 0 for a one-sided test of negative autocorrelation
Most applications test for positive autocorrelation first, since it's far more common in economic time series.

Rejection regions
For a two-sided test (checking for both positive and negative autocorrelation), the decision rules are:
- 0 ≤ d < d_L: Reject H₀, conclude positive autocorrelation
- 4 − d_L < d ≤ 4: Reject H₀, conclude negative autocorrelation
- d_U < d < 4 − d_U: Do not reject H₀, no evidence of autocorrelation
- d_L ≤ d ≤ d_U or 4 − d_U ≤ d ≤ 4 − d_L: Test is inconclusive
Notice the symmetry around 2. The test for negative autocorrelation uses 4 − d_U and 4 − d_L as bounds.
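The decision rules above can be sketched as a small helper function. The bounds passed in below are illustrative values, not taken from any particular table:

```python
def dw_decision(d, dl, du):
    """Two-sided Durbin-Watson decision given tabulated bounds dl < du."""
    if d < dl:
        return "reject H0: positive autocorrelation"
    if d > 4 - dl:
        return "reject H0: negative autocorrelation"
    if du < d < 4 - du:
        return "do not reject H0"
    return "inconclusive"


# Hypothetical bounds dl = 1.39, du = 1.60 for illustration
print(dw_decision(1.20, 1.39, 1.60))  # reject H0: positive autocorrelation
print(dw_decision(2.05, 1.39, 1.60))  # do not reject H0
print(dw_decision(1.50, 1.39, 1.60))  # inconclusive
```

Writing the rules out this way makes the symmetry around 2 explicit: the negative-autocorrelation region is just the positive one reflected to the interval (4 − d_L, 4].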
Examples of test application
Suppose you estimate a consumption function with quarterly data (n = 40, k = 2) and compute d = 1.20. At α = 0.05, the table gives d_L ≈ 1.39 and d_U ≈ 1.60. Since d < d_L, you reject H₀ and conclude there's positive autocorrelation in the residuals. Your next step would be to address it using one of the methods below.
Limitations of the test
Inconclusive regions
The inconclusive zone is the most frustrating aspect of the Durbin-Watson test. When falls between and , you can't make a definitive call. In practice, many researchers treat values in the inconclusive region as suggestive of autocorrelation and run additional tests (like Breusch-Godfrey) to confirm.
Lagged dependent variables
If your model includes a lagged dependent variable (e.g., y_{t−1}) on the right-hand side, the Durbin-Watson test is biased toward 2. This means it will tend to suggest no autocorrelation even when autocorrelation exists. For dynamic models, use the Durbin h-test or the Breusch-Godfrey test instead.
Misspecification of the model
A significant Durbin-Watson result doesn't always mean autocorrelation is the real problem. Omitted variables, incorrect functional form, or structural breaks can all produce patterns in the residuals that look like autocorrelation. Before applying a correction for autocorrelation, check whether your model specification is correct. Fixing the specification often resolves the apparent autocorrelation.
Addressing autocorrelation
If the Durbin-Watson test confirms autocorrelation, you have several options for producing reliable inference.
Generalized least squares
Generalized least squares (GLS) transforms the model to eliminate the autocorrelation structure. If you know ρ, you can transform each observation (for t = 2, …, n):

y_t* = y_t − ρ y_{t−1},  x_t* = x_t − ρ x_{t−1}

Then run OLS on the transformed data. The resulting estimates are efficient (best linear unbiased). In practice, ρ is usually unknown and must be estimated, which leads to Feasible GLS (FGLS).
Cochrane-Orcutt procedure
The Cochrane-Orcutt procedure is an iterative approach to FGLS:
- Run OLS on the original model and obtain residuals
- Regress e_t on e_{t−1} to estimate ρ
- Transform the data using (as shown above)
- Run OLS on the transformed data
- Repeat steps 2-4 until ρ̂ converges (stops changing meaningfully)
One drawback: this procedure drops the first observation. The Prais-Winsten method is a variant that retains it, which matters more in small samples.
Newey-West standard errors
If your main concern is valid inference rather than efficiency, Newey-West (HAC) standard errors are a practical alternative. They adjust the standard errors to be robust to both autocorrelation and heteroskedasticity, without transforming the model.
- The coefficient estimates stay the same as OLS
- Only the standard errors (and therefore t-statistics and p-values) change
- You need to choose a bandwidth (maximum lag length), often set by a rule of thumb such as 4(n/100)^{2/9}, rounded down
Newey-West standard errors are widely used because they don't require you to specify the exact autocorrelation structure. They're especially common in applied time series work.