IV Estimation Formulas
z = instrument | x = endogenous variable | y = dependent variable
Model Assumptions
Assumptions
- Random sampling / i.i.d.: Observations are independently and identically distributed
- Instrument relevance: Cov(z, x) ≠ 0 — tested by the first-stage partial F-statistic on excluded instruments
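- Instrument exogeneity: Cov(z, u) = 0, where u is the structural error. This is untestable under exact identification (q = 1 instrument for the single endogenous variable). When the model is overidentified (q > 1), the Sargan test of overidentifying restrictions can check it, but only under the maintained assumption that at least one instrument is valid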
- Exclusion restriction: The instrument z affects y only through the endogenous variable x
- Homoskedasticity: Standard 2SLS standard errors assume homoskedastic errors; robust SEs require separate computation
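The relevance check above can be sketched in a few lines of NumPy. This is a minimal illustration, not the calculator's own code: the function name `first_stage_f`, its arguments, and the simulated data are all hypothetical. It computes the partial F-statistic by comparing the first-stage residual sum of squares with and without the excluded instruments.

```python
import numpy as np

def first_stage_f(x, Z_excl, W=None):
    """Partial F-statistic for the excluded instruments in the first stage.

    x      : endogenous regressor, shape (n,)
    Z_excl : excluded instruments, shape (n,) or (n, q)
    W      : optional exogenous controls, shape (n,) or (n, k); a constant is added
    """
    n = len(x)
    const = np.ones((n, 1))
    W_full = const if W is None else np.column_stack([const, W])
    Z_excl = np.asarray(Z_excl).reshape(n, -1)
    X_unres = np.column_stack([W_full, Z_excl])   # controls + instruments

    def rss(X):
        # Residual sum of squares from regressing x on X
        b = np.linalg.lstsq(X, x, rcond=None)[0]
        r = x - X @ b
        return r @ r

    rss_restricted = rss(W_full)      # instruments excluded
    rss_unrestricted = rss(X_unres)   # instruments included
    q = Z_excl.shape[1]               # number of excluded instruments
    df = n - X_unres.shape[1]         # residual degrees of freedom
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df)

# Simulated data (an assumption for illustration): a fairly strong instrument
rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)

f_stat = first_stage_f(x, z)
print(f"First-stage F = {f_stat:.1f}")
```

With a single instrument and no controls, this F-statistic equals the squared t-statistic on z in the first-stage regression.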
Properties
- LATE: Under heterogeneous treatment effects, IV estimates the local average treatment effect (LATE) for compliers, not the population average
- Efficiency trade-off: 2SLS SEs are typically larger than OLS SEs; if no endogeneity is present, OLS is consistent and more efficient
Understanding Instrumental Variables & 2SLS
What is Endogeneity?
Endogeneity occurs when an explanatory variable in a regression model is correlated with the error term, violating the exogeneity assumption that OLS relies on. The resulting OLS estimates are biased and inconsistent: the bias does not shrink as the sample grows. Common causes include omitted variables, simultaneity, and measurement error.
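A small simulation can make the bias concrete. The sketch below uses made-up data: the variable names, coefficients, and sample size are assumptions chosen for illustration. An unobserved factor v drives both x and the error u, so the OLS slope settles away from the true value of 2.0 even with a large sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

z = rng.normal(size=n)   # exogenous variation in x (a valid instrument here)
v = rng.normal(size=n)   # unobserved factor that enters both x and u
e = rng.normal(size=n)   # pure noise in the outcome equation

x = z + v                # endogenous regressor
u = 0.8 * v + e          # structural error, correlated with x through v
y = 1.0 + 2.0 * x + u    # true slope on x is 2.0

# In a simple regression the OLS slope is Cov(x, y) / Var(x)
b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Large-sample limit of OLS in this design:
# true slope + Cov(x, u) / Var(x) = 2.0 + 0.8 / 2.0 = 2.4
print(f"OLS slope: {b_ols:.3f} (true slope: 2.0)")
```

Increasing n tightens the estimate around 2.4 rather than 2.0, which is what inconsistency means in practice.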
How Does IV/2SLS Solve This?
Instrumental variables (IV) estimation addresses endogeneity by finding a variable (the instrument) that is correlated with the endogenous regressor but uncorrelated with the error term. For the simple just-identified case with one endogenous variable, one instrument, and no additional controls, the IV estimator is b̂IV = Cov(z,y) / Cov(z,x). With controls or multiple instruments, two-stage least squares (2SLS) generalizes this in two steps:
Stage 1: First Stage
Regress the endogenous variable (x) on all instruments (z) and all included exogenous controls (w). Save the fitted values x̂.
Stage 2: Second Stage
Replace the endogenous x with x̂ in the structural equation (keeping the same exogenous controls) and estimate by OLS. Note that the standard errors must be adjusted: naive OLS SEs from this second-stage regression are incorrect because they are built from residuals computed with x̂ instead of the original x.
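The two stages can be sketched with plain NumPy for the simplest case of one endogenous variable, one instrument, and a constant. The helper names `ols` and `two_stage_least_squares` and the simulated data are hypothetical, and the sketch assumes homoskedastic errors. The standard errors use residuals formed with the original x, and the degrees-of-freedom correction (n minus the number of coefficients) is one common convention; some software divides by n instead.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y on X (X must already contain a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous x, one instrument z, and a constant."""
    n = len(y)
    const = np.ones(n)
    Z = np.column_stack([const, z])   # constant + instrument
    X = np.column_stack([const, x])   # constant + endogenous regressor

    # Stage 1: regress x on the instrument and keep the fitted values
    x_hat = Z @ ols(Z, x)

    # Stage 2: regress y on the fitted values
    X_hat = np.column_stack([const, x_hat])
    beta = ols(X_hat, y)

    # Residuals for the SEs use the ORIGINAL x, not x_hat
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return beta, np.sqrt(np.diag(cov))

# Simulated data (an assumption for illustration); true slope is 2.0
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
v = rng.normal(size=n)
x = z + v
y = 1.0 + 2.0 * x + 0.8 * v + rng.normal(size=n)

beta, se = two_stage_least_squares(y, x, z)

# In the just-identified case 2SLS reduces to Cov(z, y) / Cov(z, x)
b_ratio = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"2SLS slope:  {beta[1]:.4f} (SE {se[1]:.4f})")
print(f"Ratio slope: {b_ratio:.4f}")
```

The two slope estimates agree up to floating-point error, which is a quick sanity check on any hand-rolled 2SLS routine.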
When to Use OLS vs IV
The choice depends on two diagnostic tests:
- Weak instrument test (F > 10): By a common rule of thumb, a first-stage F-statistic below 10 signals weak instruments. In that case IV estimates can be badly biased toward OLS and the usual standard errors are unreliable, so look for stronger instruments.
- Durbin-Wu-Hausman (DWH) test: The null hypothesis is that x is exogenous. If the test rejects, there is evidence of endogeneity and IV/2SLS is preferred. If it fails to reject, OLS is preferred on efficiency grounds, though failure to reject is not proof of exogeneity.
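One common way to run the DWH test is the regression-based (control function) version: add the first-stage residuals to the structural equation and test whether their coefficient is zero. The sketch below is an illustration under homoskedasticity, with a hypothetical function name `dwh_t_stat` and simulated data; it compares the t-statistic with the 5% normal critical value of 1.96 instead of computing an exact p-value.

```python
import numpy as np

def dwh_t_stat(y, x, z):
    """t-statistic on the first-stage residual in the augmented regression."""
    n = len(y)
    const = np.ones(n)

    # First stage: residuals of x after projecting on the instrument
    Z = np.column_stack([const, z])
    v_hat = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

    # Augmented regression: y on constant, x, and first-stage residuals
    X_aug = np.column_stack([const, x, v_hat])
    b = np.linalg.lstsq(X_aug, y, rcond=None)[0]
    resid = y - X_aug @ b
    sigma2 = resid @ resid / (n - X_aug.shape[1])
    cov = sigma2 * np.linalg.inv(X_aug.T @ X_aug)

    # Under the null of exogeneity the coefficient on v_hat is zero
    return b[2] / np.sqrt(cov[2, 2])

# Simulated data (an assumption for illustration): x is endogenous by design
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
v = rng.normal(size=n)
x = z + v
y = 1.0 + 2.0 * x + 0.8 * v + rng.normal(size=n)

t_stat = dwh_t_stat(y, x, z)
reject_exogeneity = abs(t_stat) > 1.96   # approximate 5% two-sided test
print(f"DWH t-statistic: {t_stat:.2f}, reject exogeneity: {reject_exogeneity}")
```

A rejection here points toward IV/2SLS, provided the first-stage F-statistic indicates the instrument is not weak.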
Related Topics
- Instrumental Variables & 2SLS — Full Article
- Supply and Demand — a classic example where simultaneity creates endogeneity
- Difference-in-Differences Calculator
- Time Series Forecasting Calculator
Disclaimer
This calculator is for educational purposes only and assumes standard 2SLS with homoskedastic errors. Actual IV estimation may require robust standard errors, additional diagnostic tests, and careful consideration of instrument validity. Results should be verified with statistical software. This tool should not be used as the sole basis for empirical research conclusions.