Serial Correlation: Durbin-Watson Test, HAC Standard Errors & FGLS
Serial correlation in regression is one of the most common — and most overlooked — problems in time series econometrics. When regression errors are correlated across time periods, standard OLS inference breaks down: standard errors become unreliable, t-statistics are inflated, and researchers risk drawing false conclusions. Whether you’re modeling bond yield dynamics, forecasting interest rates, or analyzing stock return predictability, understanding how to detect and correct serial correlation is essential for valid statistical inference.
What Is Serial Correlation?
Serial correlation (also called autocorrelation) occurs when the error terms in a regression model are correlated across time periods. In a properly specified model with independent errors, knowing today’s error tells you nothing about tomorrow’s. With serial correlation, that independence breaks down — errors exhibit a pattern over time.
Serial correlation means that Corr(ut, us) ≠ 0 for t ≠ s. In plain terms, today’s regression error is correlated with yesterday’s error — the errors are not independent across time.
The most common form is first-order serial correlation, modeled as an AR(1) process:

ut = ρut−1 + et,  |ρ| < 1

where ρ is the serial correlation coefficient and et is an independent, mean-zero innovation.
Positive serial correlation (ρ > 0) is by far the most common in financial data. A positive error today tends to be followed by a positive error tomorrow — residuals cluster in runs above and below zero. This arises naturally because economic conditions like interest rate regimes, credit cycles, and business cycle phases persist across periods.
Negative serial correlation (ρ < 0) is less common but can occur in models with overcorrection dynamics, where a positive error today is followed by a negative error tomorrow.
Serial correlation typically arises from omitted slowly-changing variables, model misspecification (such as omitting a time trend or seasonal component), or inherent inertia in economic data. Interest rate series, corporate earnings growth, and credit spreads are classic examples of data that generate serially correlated regression errors.
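To see what serially correlated errors look like in practice, here is a minimal simulation sketch (all parameter values are illustrative, not from any dataset in this article). It generates AR(1) errors and measures their lag-1 autocorrelation — note how a single persistence parameter produces the characteristic runs of same-signed errors:

```python
# Illustrative sketch: simulate AR(1) errors u_t = rho * u_{t-1} + e_t
# to show how positive serial correlation produces runs of same-signed
# errors. T and rho are assumed values chosen for demonstration.
import numpy as np

rng = np.random.default_rng(0)
T, rho = 500, 0.7                      # assumed sample size and persistence

e = rng.standard_normal(T)             # independent innovations
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + e[t]       # AR(1) recursion

# Sample lag-1 autocorrelation of the errors -- typically close to rho
r1 = np.corrcoef(u[1:], u[:-1])[0, 1]
print(f"lag-1 autocorrelation: {r1:.3f}")
```

Plotting `u` against time would show the clustering described above: long stretches where the error stays positive or negative.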
Consequences of Serial Correlation for OLS
Under standard assumptions (strictly exogenous regressors in a static or finite distributed lag model), serial correlation does not bias OLS coefficient estimates — they remain unbiased and consistent. The problem is entirely with inference: standard errors, t-statistics, confidence intervals, and hypothesis tests become unreliable.
When positive serial correlation is present in time series regression, four problems emerge:
- Standard errors are biased downward — OLS assumes each observation provides independent information. With positive serial correlation, consecutive observations carry overlapping information, so the effective sample size is smaller than OLS assumes. The result: standard errors that are too small.
- t-statistics are inflated — Because standard errors are too small, the corresponding t-statistics are too large. Researchers may find “statistically significant” relationships that are actually noise.
- Confidence intervals are too narrow — The understated standard errors produce confidence intervals that fail to achieve their nominal coverage level (e.g., a “95%” interval may actually cover the true parameter only 80% of the time).
- Model misspecification often accompanies serial correlation — While R-squared itself remains a consistent measure under stationarity, serial correlation is frequently a symptom of a misspecified model (missing lags, trends, or structural breaks), and that misspecification can inflate the apparent goodness of fit.
Under serial correlation, OLS is no longer BLUE (Best Linear Unbiased Estimator) — more efficient estimators exist that exploit the error structure.
If your time series regression shows suspiciously high t-statistics and a very smooth residual plot (residuals staying positive or negative for extended stretches), serial correlation is a likely culprit. Always test before trusting your results.
Serial correlation is often a diagnostic signal that your model is misspecified — it may be missing important lags, a time trend, or seasonal controls. Before applying HAC standard errors or FGLS, consider whether adding omitted variables or restructuring the model eliminates the serial correlation. Sometimes the right fix is better model specification, not a statistical correction. First differencing — subtracting each variable’s previous-period value — is a separate approach used when unit roots are suspected, addressing non-stationarity rather than AR(1) error dependence.
Testing for Serial Correlation
The Durbin-Watson Test
The Durbin-Watson (DW) test is the classic diagnostic for first-order serial correlation. It uses the OLS residuals ût to compute a test statistic that indicates whether consecutive errors are correlated:

DW = Σ(ût − ût−1)² / Σ ût²

where the numerator sums over t = 2, …, T and the denominator over t = 1, …, T.
The DW statistic has a simple approximate relationship to the estimated serial correlation coefficient, DW ≈ 2(1 − ρ̂), so it ranges from 0 to 4 and equals 2 when ρ̂ = 0:
| DW Value | Implied ρ̂ | Interpretation |
|---|---|---|
| ≈ 2.0 | ≈ 0 | No serial correlation |
| < 2.0 | > 0 | Positive serial correlation (most common) |
| > 2.0 | < 0 | Negative serial correlation |
| ≈ 0 | ≈ 1 | Strong positive serial correlation |
| ≈ 4.0 | ≈ −1 | Strong negative serial correlation |
The DW test uses critical value bounds (dL and dU) that depend on the sample size and number of regressors. For testing positive serial correlation: if DW < dL, reject the null; if DW > dU, fail to reject; values between dL and dU are inconclusive. For negative serial correlation, apply the same bounds to (4 − DW). The inconclusive region is a notable disadvantage — the Breusch-Godfrey test avoids this problem entirely.
A researcher regresses monthly changes in the 10-Year Treasury yield on the Federal Funds rate and CPI inflation (2000–2024, T = 300 months). The OLS regression produces a DW statistic of 0.87.
Using the DW–rho relationship: ρ̂ ≈ 1 − DW/2 = 1 − 0.87/2 = 0.565. With critical values dL = 1.72 and dU = 1.76 at the 5% level, DW = 0.87 is well below dL — strong evidence of positive serial correlation. The OLS standard errors from this regression cannot be trusted.
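The DW statistic is easy to compute directly from OLS residuals. The sketch below uses simulated stand-in data (not the Treasury yield regression above) with errors built to have moderate positive persistence:

```python
# Minimal sketch: compute the Durbin-Watson statistic from OLS residuals.
# The data are simulated stand-ins; the AR(1) coefficient 0.55 is assumed.
import numpy as np

rng = np.random.default_rng(1)
T = 300
x = rng.standard_normal(T)
e = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.55 * u[t - 1] + e[t]      # positively correlated errors
y = 1.0 + 2.0 * x + u

# OLS fit and residuals
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# DW = sum of squared residual differences / sum of squared residuals
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
rho_hat = 1.0 - dw / 2.0               # implied AR(1) coefficient
print(f"DW = {dw:.2f}, implied rho-hat = {rho_hat:.2f}")
```

With errors this persistent, DW lands well below 2, mirroring the interpretation in the worked example.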
The Breusch-Godfrey LM Test
The Breusch-Godfrey (BG) test is more general than the Durbin-Watson test and is the preferred diagnostic in modern applied work. Its key advantages:
- Valid when the model includes lagged dependent variables (where DW is biased)
- Can test for higher-order serial correlation (e.g., AR(2), AR(4) for quarterly seasonality)
- Produces a clear reject/fail-to-reject decision (no inconclusive region)
The procedure is straightforward: regress the OLS residuals on the original regressors plus q lagged residuals, then compute the LM statistic as LM = n × R²aux (where n is the number of observations in the auxiliary regression), which follows a χ²(q) distribution under the null hypothesis of no serial correlation up to order q.
t-Test for AR(1) Residuals
The simplest serial correlation test regresses the OLS residuals on their own first lag: ût = α + ρût−1 + error. The t-statistic on ρ̂ tests H0: ρ = 0. This simple version is valid when all regressors are strictly exogenous. When the model includes lagged dependent variables or other non-strictly-exogenous regressors, you must include all original regressors in the auxiliary regression alongside ût−1 to obtain a valid test — this is Wooldridge’s general version (Eq. 12.24), which is equivalent to the Breusch-Godfrey test with q = 1.
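For the simple strictly-exogenous case, the residual AR(1) t-test takes only a few lines. This is an illustrative numpy sketch with simulated residuals (the persistence value is assumed):

```python
# Sketch of the residual AR(1) t-test: regress u-hat_t on u-hat_{t-1}
# and examine the t-statistic on rho-hat. Residuals are simulated here.
import numpy as np

rng = np.random.default_rng(3)
T = 250
e = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + e[t]       # residuals with AR(1) structure

# Regress u_t on an intercept and u_{t-1}
Z = np.column_stack([np.ones(T - 1), u[:-1]])
r = u[1:]
coef, *_ = np.linalg.lstsq(Z, r, rcond=None)
res = r - Z @ coef
s2 = res @ res / (len(r) - 2)                      # error variance
var_coef = s2 * np.linalg.inv(Z.T @ Z)             # OLS covariance matrix
t_rho = coef[1] / np.sqrt(var_coef[1, 1])          # t-stat on rho-hat
print(f"rho-hat = {coef[1]:.2f}, t = {t_rho:.1f}")
```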
Serial Correlation-Robust Inference: Newey-West Standard Errors
Rather than correcting for serial correlation (which requires assuming a specific error structure), you can compute standard errors that are asymptotically valid regardless of the error structure. Newey-West standard errors — also called HAC (Heteroskedasticity and Autocorrelation Consistent) standard errors — achieve this by adjusting the variance estimate to account for error dependence.
The bandwidth (g) determines how many lagged autocovariances to include. In practice, bandwidth values are typically small — Wooldridge notes that values of g = 1 or 2 are common for annual data, with moderately larger values for quarterly or monthly data. Most statistical software packages (Stata, R, EViews) compute a default bandwidth automatically based on the sample size. It is good practice to check sensitivity by trying a few different values — if your conclusions change substantially with different bandwidths, the results may not be robust.
Newey-West standard errors are robust to both serial correlation and heteroskedasticity — the time series analog of White’s robust standard errors used in cross-sectional analysis. They use the original OLS coefficient estimates, so no re-estimation is required.
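The estimator itself is a "sandwich" formula with Bartlett kernel weights on the lagged score autocovariances. The bare-bones sketch below is for illustration only (it omits the small-sample corrections real packages apply); in practice you would use your software's built-in HAC option, such as statsmodels' `cov_type="HAC"`:

```python
# Bare-bones sketch of the Newey-West (HAC) variance estimator with
# Bartlett weights. Illustrative only: no small-sample corrections.
import numpy as np

def newey_west_se(X, resid, g):
    """HAC standard errors for OLS with bandwidth g (number of lags)."""
    xu = X * resid[:, None]                  # score contributions x_t * u_t
    S = xu.T @ xu                            # lag-0 term (White's "meat")
    for j in range(1, g + 1):
        w = 1.0 - j / (g + 1.0)              # Bartlett kernel weight
        gamma = xu[j:].T @ xu[:-j]           # lag-j autocovariance of scores
        S += w * (gamma + gamma.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    V = XtX_inv @ S @ XtX_inv                # sandwich variance
    return np.sqrt(np.diag(V))

# Simulated demonstration data with AR(1) errors (parameters assumed)
rng = np.random.default_rng(4)
T = 300
x = rng.standard_normal(T)
e = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + e[t]
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones(T), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
print("HAC SEs (g=2):", newey_west_se(X, resid, g=2))
```

A useful sanity check: with g = 0 the formula collapses to White's heteroskedasticity-robust estimator, consistent with HAC being its time series generalization.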
Returning to the Treasury yield regression (T = 300, DW = 0.87), the OLS t-statistic on the Federal Funds rate coefficient is 4.81 — highly significant. After computing Newey-West standard errors with bandwidth g = 2, the t-statistic drops to 2.14.
The coefficient is still statistically significant at the 5% level, but far less extreme than OLS suggested. Without the HAC correction, a researcher might have overstated the precision of this estimate by more than a factor of two.
Correcting for Serial Correlation: FGLS
When you are confident that the errors follow an AR(1) process and all regressors are strictly exogenous, you can go beyond adjusting standard errors and actually transform the data to eliminate the serial correlation. This approach — called Feasible Generalized Least Squares (FGLS) — can produce more efficient estimates than OLS. FGLS is not valid when the model contains lagged dependent variables or other non-strictly-exogenous regressors — in those cases, use Newey-West (HAC) standard errors instead.
The core idea is quasi-differencing: subtracting ρ times the lagged value from each variable removes the serial correlation from the errors:

yt − ρyt−1 = β0(1 − ρ) + β1(xt − ρxt−1) + et

Because et is serially uncorrelated, OLS on the transformed variables satisfies the classical assumptions.
Since ρ is unknown, FGLS uses the estimated ρ̂ from the residual regression (Step 1 of the t-test above). The procedure:
- Run OLS on the original model and obtain residuals ût
- Regress ût on ût−1 to estimate ρ̂
- Quasi-difference all variables using ρ̂
- Run OLS on the transformed data to obtain FGLS estimates
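The four steps above can be sketched directly in numpy (Cochrane-Orcutt style, dropping the first observation). The data and parameter values are simulated stand-ins for illustration:

```python
# Sketch of the FGLS procedure (Cochrane-Orcutt variant: the first
# observation is dropped). Simulated data; parameters are assumed.
import numpy as np

rng = np.random.default_rng(5)
T = 300
x = rng.standard_normal(T)
e = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + e[t]       # strongly persistent errors
y = 2.0 + 1.0 * x + u                  # true beta0 = 2, beta1 = 1

# Step 1: OLS on the original model, obtain residuals
X = np.column_stack([np.ones(T), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Step 2: estimate rho from u-hat_t on u-hat_{t-1} (no intercept)
rho_hat = (resid[:-1] @ resid[1:]) / (resid[:-1] @ resid[:-1])

# Step 3: quasi-difference all variables; the intercept column becomes
# (1 - rho_hat), so its coefficient estimates beta0 directly.
# (Prais-Winsten would also keep obs 1, scaled by sqrt(1 - rho_hat**2).)
y_star = y[1:] - rho_hat * y[:-1]
X_star = X[1:] - rho_hat * X[:-1]

# Step 4: OLS on the transformed data gives the FGLS estimates
beta_star, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
print(f"rho-hat = {rho_hat:.2f}, FGLS estimates = {beta_star.round(2)}")
```

Iterating steps 1–4 until ρ̂ converges gives the iterated Cochrane-Orcutt procedure mentioned below.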
Two standard implementations exist:
| Method | First Observation | Key Feature |
|---|---|---|
| Cochrane-Orcutt | Dropped | Simpler; iterates until ρ̂ converges |
| Prais-Winsten | Retained (weighted by √(1 − ρ̂²)) | More efficient in small samples; preserves information |
A quarterly regression of BBB corporate bond spreads on GDP growth and the VIX index (2005–2024, T = 80) produces ρ̂ = 0.73, indicating strong positive serial correlation. Credit spreads are slow-moving and persistent — a textbook case for FGLS correction.
After applying Prais-Winsten estimation, the standard error on the VIX coefficient increases from 0.041 (OLS) to 0.068 (FGLS), and the coefficient on GDP growth loses statistical significance at the 5% level — a result masked by the artificially small OLS standard errors.
HAC Standard Errors vs FGLS
When serial correlation is detected, researchers face a choice between two correction strategies. The right choice depends on how confident you are in the error structure:
HAC Standard Errors (Newey-West)
- Does not require specifying the error process
- Asymptotically valid under any form of serial correlation (and heteroskedasticity)
- Changes standard errors only — uses original OLS coefficient estimates, no re-estimation
- Less efficient than FGLS when AR(1) is correctly specified
- Current best practice in published finance research
- Best for: robustness when the error structure is unknown
FGLS (Cochrane-Orcutt / Prais-Winsten)
- Requires specifying the error process (typically AR(1)) and strictly exogenous regressors
- More efficient than OLS+HAC when the AR(1) model is correct
- Produces different coefficient estimates (not just different SEs)
- Can be biased if the error process is misspecified
- R-squared not directly comparable to OLS (different dependent variable)
- Best for: efficiency when you are confident in the AR(1) structure
In current applied finance research, HAC standard errors are the default recommendation. They sacrifice some efficiency for robustness — a trade-off most researchers are willing to make, given the difficulty of verifying the exact error structure. FGLS remains valuable when the AR(1) model is well-supported and efficiency is a priority (e.g., small samples where every degree of precision matters).
Serial correlation refers to dependence in the level of the errors — today’s error predicts tomorrow’s error. ARCH (Autoregressive Conditional Heteroskedasticity) and GARCH models address a different problem: the conditional variance of the error changes over time (volatility clustering). A bond return series might exhibit both — serially correlated errors AND time-varying volatility. When ARCH effects are present, standard FGLS (which assumes constant variance in the transformed model) may be insufficient. For modeling time-varying volatility in financial returns, see our GARCH Volatility Calculator. For unit root testing and cointegration analysis, which addresses non-stationarity rather than error dependence, see our dedicated guide.
Common Mistakes
1. Using the Durbin-Watson test with lagged dependent variables. The DW test is biased toward 2 (toward finding no serial correlation) when the regression includes lagged values of the dependent variable as regressors. This means the test has low power precisely when serial correlation is most dangerous. Use the Breusch-Godfrey LM test instead, which remains valid with lagged dependent variables.
2. Confusing serial correlation with trending data. A trending time series can produce residuals that appear serially correlated even when the true errors are independent. If you regress a corporate earnings series on a macroeconomic variable without including a time trend, the residuals will cluster in runs simply because both variables trend over time. The solution is proper model specification — include time trends or detrend the data before testing for serial correlation.
3. Applying HAC or FGLS without first checking model specification. Serial correlation is often a symptom of a misspecified model — missing lags, omitted trends, or absent seasonal controls. Jumping straight to HAC standard errors or FGLS treats the symptom without addressing the cause. Always check whether adding omitted variables or restructuring the model eliminates the serial correlation before resorting to statistical corrections.
4. Assuming serial correlation biases OLS coefficient estimates. This is a common misconception. Under standard assumptions (strictly exogenous regressors), serial correlation does not bias OLS estimates — they remain unbiased and consistent. The problem is entirely with inference: standard errors, t-statistics, and confidence intervals are unreliable. You can trust the point estimates; you just need to fix the standard errors (or the model).
Disclaimer
This article is for educational and informational purposes only and does not constitute investment or financial advice. The numerical examples and test statistics presented are illustrative and based on stylized scenarios. Serial correlation diagnostics and corrections should be applied with careful consideration of the specific data and model context. Always consult relevant econometrics references and qualified professionals for research-grade analysis.