Heteroskedasticity: Detection, Consequences & Robust Standard Errors
When an analyst regresses stock returns on firm characteristics across hundreds of companies, the residuals often reveal a pattern: the unexplained variation differs systematically across firms, with smaller, less liquid companies showing far more residual dispersion than larger, more established ones. This non-constant spread of residuals is heteroskedasticity — non-constant error variance — and it is one of the most common violations of the classical regression assumptions. Heteroskedasticity does not bias your coefficient estimates, but it does undermine the standard errors that drive every t-test and confidence interval. This guide covers what heteroskedasticity is, why it matters for statistical inference, how to detect it with the Breusch-Pagan and White tests, and how to correct it using robust standard errors or weighted least squares.
What Is Heteroskedasticity?
In the classical linear regression model, one of the key assumptions (MLR.5) is homoskedasticity: the variance of the error term is the same for all values of the independent variables. When this assumption fails — when the error variance changes systematically with one or more regressors — the data exhibit heteroskedasticity.
Homoskedasticity (MLR.5): Var(u | x1, …, xk) = σ² — the error variance is constant across all observations. Heteroskedasticity: Var(u | x1, …, xk) varies with the values of the independent variables. In a residual plot, homoskedastic data shows a uniform band of residuals; heteroskedastic data shows a funnel or fan shape where the spread widens (or narrows) as fitted values change.
The term comes from Greek: hetero (different) + skedasis (dispersion). In finance, heteroskedasticity appears frequently in cross-sectional data. When modeling house prices, the prediction errors for $800,000 homes vary far more than those for $200,000 homes. In cross-sectional asset pricing, the residual variance of stock returns often differs systematically with firm size — smaller, less liquid firms tend to have more volatile unexplained returns than large-cap firms. These patterns arise because the economic forces that generate dispersion often scale with the level of the variable being studied: dollar-denominated variables like revenues, prices, and asset values naturally have larger absolute dispersion at higher levels.
Although heteroskedasticity can also appear in time-series and panel data, this article focuses on the cross-sectional case — variance that differs across observations at a point in time. It should not be confused with serial correlation, which involves dependence across time periods. Both cause problems for standard errors, but the diagnostic tests and corrections are distinct. For the full set of OLS assumptions that heteroskedasticity violates, see our guide on simple linear regression.
Consequences of Heteroskedasticity for OLS
The good news: under assumptions MLR.1 through MLR.4, OLS coefficient estimates remain unbiased and consistent regardless of whether heteroskedasticity is present. The slope and intercept are still centered on their true population values. R-squared and adjusted R-squared are also unaffected.
The bad news: OLS is no longer BLUE (Best Linear Unbiased Estimator). The Gauss-Markov theorem requires homoskedasticity, so when it fails, OLS loses its efficiency advantage. More critically, the usual formulas for standard errors are biased — they assume a common σ² that does not exist. The usual standard errors can be too small or too large, though in practice understatement (leading to over-rejection of null hypotheses) is the more common concern. The usual t, F, and LM statistics no longer follow their assumed distributions, so all hypothesis testing and confidence intervals become unreliable — not just the printed standard error column.
Do not confuse the effects of heteroskedasticity with those of omitted variable bias. Heteroskedasticity biases standard errors (and therefore t-statistics and p-values), but the OLS coefficient estimates remain unbiased and consistent. If your coefficients are biased, the problem lies elsewhere — most commonly in omitted variables or measurement error, not heteroskedasticity.
How to Detect Heteroskedasticity: Breusch-Pagan and White Tests
Before running formal tests, start with a residual plot. Plot the squared OLS residuals (û²) against each independent variable or against the fitted values (ŷ). If the spread of residuals systematically widens or narrows, heteroskedasticity is likely present. Visual inspection is informal but powerful — it can reveal patterns that formal tests miss and can help identify which variables are associated with the changing variance.
The Breusch-Pagan Test
The Breusch-Pagan (BP) test formalizes the visual inspection by testing whether the squared residuals are linearly related to the independent variables. The procedure is:
- Estimate the original model by OLS and obtain the squared residuals ûi²
- Regress ûi² on x1, x2, …, xk (the same independent variables from the original model)
- Compute the LM statistic from the R-squared of this auxiliary regression
- Compare to a chi-squared distribution with k degrees of freedom
If the p-value is small (below your chosen significance level), reject the null hypothesis of homoskedasticity. An equivalent F-test can also be used, which may perform better in smaller samples.
An analyst regresses house price on lot size, living area, and number of bedrooms for a cross-section of 88 home sales:
Pricê = −21,770 + 2.07 × LotSize + 123.5 × LivingArea + 13,850 × Bedrooms
Regressing the squared residuals on these three regressors yields R² = 0.1601.
- LM = 88 × 0.1601 ≈ 14.09
- χ² critical value (3 df, 5% level) = 7.81
- Since 14.09 > 7.81, reject homoskedasticity
Interpretation: larger, more expensive homes have more variable pricing errors. Appraisals for $800,000 properties fluctuate more than for $200,000 properties — a classic case of variance increasing with the level of the dependent variable.
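The arithmetic in this example can be verified with scipy's chi-squared distribution. The sketch below uses the unrounded auxiliary R² of 0.1601, which reproduces the LM statistic of 14.09:

```python
from scipy.stats import chi2

n, r2, k = 88, 0.1601, 3     # values from the house-price example
lm = n * r2                  # LM = n * R-squared of the auxiliary regression
crit = chi2.ppf(0.95, df=k)  # 5% critical value, 3 degrees of freedom
pval = chi2.sf(lm, df=k)     # p-value of the LM statistic

print(f"LM = {lm:.2f}, critical value = {crit:.2f}, p = {pval:.4f}")
```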
The White Test
The White test is more general than Breusch-Pagan because it does not assume a linear relationship between the squared residuals and the regressors. Instead, it regresses û² on the original x variables, their squares, and all cross-products, then tests whether they jointly explain the residual variance. The downside is that this auxiliary regression uses many degrees of freedom when the model has several regressors.
A practical simplification: regress û² on the fitted values ŷ and ŷ² only. This simplified version is a special case of the full White test — it has lower power against some forms of heteroskedasticity but uses just 2 degrees of freedom regardless of the number of original regressors, making it practical when the full specification would exhaust degrees of freedom.
The Breusch-Pagan and White tests can also reject because of functional-form misspecification in the original model (e.g., omitting a squared term or an interaction), not only because of true heteroskedasticity. A significant test result should prompt you to examine whether the model itself is correctly specified before concluding that heteroskedasticity is the sole issue.
Start with a residual plot before running formal tests. If you see a clear funnel pattern, heteroskedasticity is likely present regardless of what the Breusch-Pagan p-value says. Conversely, if the residual plot looks uniform, a marginally significant BP test may be driven by a few outliers rather than systematic heteroskedasticity.
Heteroskedasticity-Robust Standard Errors
Rather than assuming a specific form for the heteroskedasticity, we can compute standard errors that are valid whether or not homoskedasticity holds. These are called heteroskedasticity-robust standard errors — also known as White, Huber-White, or sandwich standard errors.
The key insight is to replace the constant σ² in the variance formula with the observation-specific squared residuals ûi², which serve as estimates of the individual error variances. In the simple regression case, the robust variance estimator is Var(β̂1) = [Σ (xi − x̄)² ûi²] / SSTx², where SSTx = Σ (xi − x̄)²; the robust standard error is its square root. In multiple regression, (xi − x̄) is replaced by the residuals from regressing xj on the other independent variables.
Several variants exist: HC0 is the original White estimator, HC1 applies a degrees-of-freedom correction (multiplying by n/(n − k − 1)), and HC3 uses a leverage-based adjustment that often performs best in small samples. The robust t-statistic uses the same formula as usual — it simply replaces the standard error in the denominator with the robust version.
An analyst regresses annual stock returns on firm characteristics for 200 companies. The OLS coefficients are identical under both approaches — only the standard errors change:
| Variable | Coefficient | Usual SE | Robust SE | Usual t | Robust t |
|---|---|---|---|---|---|
| log(Market Cap) | −0.032 | 0.011 | 0.018 | −2.91 | −1.78 |
| Book-to-Market | 0.085 | 0.025 | 0.029 | 3.40 | 2.93 |
| Leverage | −0.041 | 0.019 | 0.031 | −2.16 | −1.32 |
Notice that the robust standard errors are larger for every variable. log(Market Cap) and Leverage lose statistical significance at the 5% level when robust SEs are used. The usual standard errors were too small, leading to false rejections — exactly the problem heteroskedasticity creates.
Many applied researchers now report robust standard errors by default in all cross-sectional regressions, regardless of whether a formal test rejects homoskedasticity. This avoids the pre-testing problem — where your inference depends on a preliminary test whose own outcome is uncertain. When robust and usual standard errors agree closely, it suggests homoskedasticity is not a major concern in your data.
For time-series data where errors may be both heteroskedastic and serially correlated, a different correction is needed: HAC (heteroskedasticity and autocorrelation consistent) standard errors, also known as Newey-West standard errors. For details, see our guide on serial correlation in time series. For volatility clustering and GARCH models in financial returns, see our guide on Value at Risk.
Weighted Least Squares (WLS)
When you know (or can estimate) how the error variance depends on the independent variables, weighted least squares offers a more efficient alternative to OLS with robust standard errors. WLS assumes that Var(u | x) = σ² × h(x), where h(x) is a known positive function. Dividing every term of the model by √h(xi), so that yi/√h(xi) = β0/√h(xi) + β1xi1/√h(xi) + … + βkxik/√h(xi) + ui/√h(xi), produces a transformed error ui/√h(xi) with constant variance σ².
Observations with higher error variance receive less weight, and observations with lower variance receive more. When the variance function is correctly specified, WLS is more efficient than OLS — it produces smaller standard errors and tighter confidence intervals.
Feasible GLS (FGLS)
In practice, h(x) is rarely known exactly. Feasible GLS estimates the variance function from the data. The most common approach assumes an exponential form: Var(u | x) = σ² exp(δ0 + δ1x1 + … + δkxk). The procedure is:
- Estimate the original model by OLS and obtain squared residuals ûi²
- Regress log(ûi²) on x1, …, xk and obtain fitted values ĝi
- Compute estimated weights: ĥi = exp(ĝi)
- Apply WLS using weights 1/ĥi
FGLS estimators are no longer unbiased (because the weights are estimated), but they are consistent and asymptotically more efficient than OLS. However, if the assumed variance function is wrong, WLS can produce estimates that are less efficient than simple OLS with robust standard errors. When in doubt about the functional form, robust standard errors are the safer choice.
Heteroskedasticity in the Linear Probability Model
When the dependent variable is binary (0 or 1), OLS produces the linear probability model (LPM). In the LPM, heteroskedasticity is not just possible — it is guaranteed whenever the slope coefficients are nonzero. The conditional variance of a binary Y is Var(Y | x) = p(x)[1 − p(x)], where p(x) = β0 + β1x1 + … + βkxk is the response probability.
Since p(x) changes with x, the variance changes with x by construction. Heteroskedasticity-robust standard errors are therefore mandatory for valid inference in the LPM. WLS is theoretically possible but has a practical complication: some fitted probabilities may fall outside (0, 1), producing negative estimated variances. For a full treatment of the linear probability model and its alternatives (logit, probit), see our guide on the linear probability model.
Robust Standard Errors vs. Weighted Least Squares
The two main approaches to handling heteroskedasticity involve a fundamental trade-off between robustness and efficiency:
Robust Standard Errors
- Assumption: None about the form of heteroskedasticity
- What changes: Standard errors only — OLS coefficients unchanged
- Efficiency: Not fully efficient (uses OLS estimates, not optimal weights)
- Risk: Valid whether or not heteroskedasticity is present
- Sample size: Requires large samples for reliable performance
- Best for: Default choice when variance form is unknown
Weighted Least Squares
- Assumption: Var(u | x) = σ² × h(x), where h(x) is specified
- What changes: Both coefficients and standard errors (re-estimated)
- Efficiency: More efficient when variance function is correctly specified
- Risk: Misspecified h(x) can make estimates worse than OLS
- Sample size: Works in small samples when h(x) is correct
- Best for: Strong theoretical basis for the variance structure
Current best practice in applied econometrics: report robust standard errors by default in cross-sectional work. Use WLS only when you have strong theoretical or empirical reasons to specify the variance function — for example, when economic theory predicts that variance is proportional to a specific variable (such as house size in a pricing model or firm assets in a profitability regression). Note that the time-series analog of robust standard errors — HAC/Newey-West standard errors — follows the same philosophy of robustness over efficiency.
Common Mistakes
1. Believing heteroskedasticity biases OLS coefficients. This is the most widespread misconception. Heteroskedasticity biases standard errors, not coefficients. OLS slope estimates remain unbiased and consistent under MLR.1 through MLR.4. The problem is entirely about inference: wrong standard errors lead to wrong t-statistics, wrong p-values, and wrong confidence intervals. For problems that do bias coefficients, see omitted variable bias.
2. Ignoring obvious residual patterns. If a residual-versus-fitted-values plot clearly fans out, the usual OLS standard errors are unreliable. At minimum, report heteroskedasticity-robust standard errors. Ignoring the visual evidence and relying on usual SEs can lead to false rejections (Type I errors) — declaring a coefficient statistically significant when the evidence is actually too weak.
3. Over-correcting with WLS using an incorrect variance function. WLS requires specifying how the variance depends on the independent variables. If the assumed function is wrong, WLS can produce estimates that are less efficient than OLS with robust standard errors. Unless you have a strong theoretical basis for the variance structure, robust standard errors are the safer remedy.
4. Confusing heteroskedasticity with serial correlation. Both cause biased standard errors, but they arise from different data structures. Heteroskedasticity is a cross-sectional problem (variance differs across observations). Serial correlation is a time-series problem (errors are correlated across time periods). The diagnostic tests and corrections are distinct — using a Breusch-Pagan test on time-series data does not check for serial correlation, and a Durbin-Watson test on cross-sectional data does not check for heteroskedasticity.
5. Thinking robust standard errors fix omitted variable bias or functional-form problems. Robust SEs correct the standard errors for non-constant variance, but they do not address bias in the coefficients themselves. If important variables are omitted or the functional form is wrong, the coefficients are biased regardless of which standard errors you report. Robust standard errors make your inference about biased coefficients more honest, but they cannot make the coefficients unbiased.
Limitations of Heteroskedasticity Corrections
Robust standard errors have only asymptotic justification. In small samples (fewer than 50 observations), they can be unreliable — sometimes even further from the true standard errors than the usual OLS versions. The HC3 variant partially addresses this, but no heteroskedasticity correction substitutes for having an adequate sample size. Similarly, the Breusch-Pagan and White tests have low power in small samples and may fail to detect heteroskedasticity that is genuinely present.
Heteroskedasticity corrections fix the inference problem (wrong standard errors) but do not improve the efficiency of OLS unless you use WLS with a correctly specified variance function. If your primary concern is prediction accuracy rather than hypothesis testing, heteroskedasticity is less consequential — the OLS fitted values are the same regardless of whether you use robust or usual standard errors.
Disclaimer
This article is for educational and informational purposes only and does not constitute investment advice. The regression estimates and test statistics used in examples are illustrative and may differ based on the data source, time period, and methodology. Always conduct your own analysis and consult a qualified financial advisor before making investment decisions. Reference: Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach, 8th Edition, Cengage, 2025.