Quick Reference

Simple: ŷ = b0 + b1x
Multiple: ŷ = b0 + b1x1 + b2x2 + ...
b1 = Cov(x, y) / Var(x)  •  b0 = ȳ − b1x̄
R² = 1 − SSR/SST  •  SER = √(SSR/(n−k−1))

Model Assumptions

For coefficient estimation (unbiasedness):

  • Linear in parameters
  • Random sampling (i.i.d.)
  • No perfect multicollinearity
  • Zero conditional mean: E(u|X) = 0

For standard errors and efficiency:

  • Homoskedasticity: Var(u|X) = σ²
  • Normality of errors (small-sample inference)

Standard errors shown are conventional (homoskedastic). For educational purposes only.

Built by Ryan O'Connell, CFA, Econometrics & Finance Professional

Regression Results

ŷ = 2.2000 + 0.6000 · x1
Variable      Coefficient   Std. Error
Intercept     2.2000        0.9381
X1            0.6000        0.2828

Conventional (homoskedastic) standard errors

R²                0.6000
Adj. R²           0.4667
SER               0.8944
Observations      5
SST (Total)       6.00
SSE (Explained)   3.60
SSR (Residual)    2.40

R² = 0.6000: The model explains 60.00% of the variation in Y.

Scatter Plot with OLS Fitted Line

Residual Plot

Residuals vs. fitted values. Look for patterns: a random scatter suggests the linear model is appropriate; a fan shape suggests heteroskedasticity.
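As a numeric companion to the visual check, the correlation between the squared residuals and the fitted values gives a crude screen for a fan shape; this is the intuition behind the Breusch-Pagan test. A minimal sketch, using illustrative data consistent with the fitted line shown above (not necessarily the inputs originally entered):

```python
# Crude numeric screen for heteroskedasticity: correlate squared residuals
# with fitted values (the intuition behind the Breusch-Pagan test).
# The data are illustrative, chosen to be consistent with the results above.
x = [1, 2, 3, 4, 5]
y = [3, 4, 3, 4, 6]
b0, b1 = 2.2, 0.6                       # coefficients from the results table

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

# A strong correlation between e² and ŷ hints at a fan shape; near zero
# is consistent with homoskedasticity. This is a rough screen, not a test.
het_corr = corr([e * e for e in resid], fitted)
print(f"corr(e², ŷ) = {het_corr:.4f}")
```

A formal version would regress e² on the fitted values (or the regressors) and test the resulting R², which is what Breusch-Pagan does.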

Calculation Steps
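The results above can be reproduced step by step from the Quick Reference formulas. A minimal sketch in pure Python; the data below are illustrative values chosen because they reproduce the displayed output exactly, not necessarily the inputs originally entered:

```python
# Worked OLS calculation, following the Quick Reference formulas.
# Illustrative data consistent with the results shown above.
x = [1, 2, 3, 4, 5]
y = [3, 4, 3, 4, 6]
n = len(x)

x_bar = sum(x) / n                      # x̄ = 3.0
y_bar = sum(y) / n                      # ȳ = 4.0

# Slope: b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx                        # 0.6
b0 = y_bar - b1 * x_bar                 # 2.2

# Sums of squares and fit statistics
fitted = [b0 + b1 * xi for xi in x]
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual
r2 = 1 - ssr / sst
k = 1                                   # one regressor
ser = (ssr / (n - k - 1)) ** 0.5

print(f"ŷ = {b0:.4f} + {b1:.4f}·x,  R² = {r2:.4f},  SER = {ser:.4f}")
```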

Understanding OLS Regression

What is OLS Regression?

Ordinary Least Squares (OLS) is the most widely used method for estimating a linear regression model. It works by finding the line (or hyperplane, in multiple regression) that minimizes the sum of squared residuals — the vertical distances between each observed data point and the fitted line.

OLS Estimators (Simple Regression)
Slope: b1 = Cov(x, y) / Var(x) = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
Intercept: b0 = ȳ − b1 · x̄
The regression line always passes through the point (x̄, ȳ).

Simple vs. Multiple Regression

Simple Regression

One independent variable
y = b0 + b1x + u. Estimates the bivariate relationship between X and Y. Easy to visualize with a scatter plot and fitted line.

Multiple Regression

Two or more independent variables
y = b0 + b1x1 + b2x2 + ... + u. Controls for confounders, isolating each variable's ceteris paribus effect.
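A minimal sketch of estimating a multiple regression numerically. The data are hypothetical, constructed so that y = 2 + 1·x1 + 0.5·x2 holds exactly:

```python
import numpy as np

# Hypothetical data: y depends on two regressors x1 and x2.
# Constructed so y = 2 + 1·x1 + 0.5·x2 exactly, for illustration.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 2.0 + 1.0 * x1 + 0.5 * x2

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solve the least-squares problem min ||y − Xb||².
b, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = b
print(f"ŷ = {b0:.3f} + {b1:.3f}·x1 + {b2:.3f}·x2")
```

The rank returned by `lstsq` is a quick check for perfect multicollinearity: with an intercept and two regressors it should equal 3.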

Interpreting Regression Output

A regression output table shows each variable's coefficient (estimated effect on Y per one-unit change in X, holding other variables constant) and standard error (precision of the estimate). Smaller standard errors indicate more precisely estimated coefficients.

  • R-squared (R²) measures the proportion of variation in Y explained by the model (0 to 1). A value of 0.60 means 60% of variation is explained. Note: a low R² does not necessarily mean the model is bad, particularly in cross-sectional data.
  • Adjusted R² penalizes for adding variables and is preferred when comparing models with different numbers of regressors.
  • SER (Standard Error of Regression) estimates the standard deviation of the error term, measuring average prediction error.
Ceteris Paribus: In multiple regression, each coefficient represents the partial effect of that variable holding all other regressors constant. This is the key advantage over running separate simple regressions.
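For example, the fit statistics in the results table above follow directly from the sums of squares:

```python
# Reproduce the fit statistics from the sums of squares in the results table.
sst, ssr = 6.00, 2.40      # total and residual sums of squares
n, k = 5, 1                # observations and number of regressors

r2 = 1 - ssr / sst                                # 0.6000
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)     # ≈ 0.4667
ser = (ssr / (n - k - 1)) ** 0.5                  # ≈ 0.8944

print(f"R² = {r2:.4f}, Adj. R² = {adj_r2:.4f}, SER = {ser:.4f}")
```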

OLS Assumptions (Gauss-Markov)

OLS is unbiased under MLR.1–MLR.4. Adding MLR.5 makes OLS the best linear unbiased estimator (the Gauss-Markov theorem) and validates conventional standard errors; MLR.6 is needed only for exact small-sample t and F inference (Wooldridge, Chapters 2–3):

  • MLR.1 — Linearity: The population model is linear in parameters
  • MLR.2 — Random Sampling: Observations are i.i.d. draws from the population
  • MLR.3 — No Perfect Multicollinearity: No regressor is an exact linear function of others
  • MLR.4 — Zero Conditional Mean: E(u|X) = 0 — omitted variables are not correlated with included regressors
  • MLR.5 — Homoskedasticity: Var(u|X) = σ² — error variance is constant (needed for valid conventional SEs)
  • MLR.6 — Normality: u|X ~ N(0, σ²) — only needed for exact small-sample t and F distributions
Important: The standard errors reported by this calculator assume homoskedasticity (MLR.5). If you suspect the error variance changes with X (heteroskedasticity), the conventional standard errors may be invalid and robust (heteroskedasticity-consistent) standard errors should be used instead.
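As a sketch of the robust alternative, heteroskedasticity-consistent (HC1) standard errors can be computed from the same design matrix. The data here are illustrative values consistent with the results table above, not necessarily the inputs originally entered:

```python
import numpy as np

# Conventional vs. HC1 robust standard errors for the simple regression.
# Illustrative data consistent with the results shown above.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 4.0, 3.0, 4.0, 6.0])

X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept
n, k1 = X.shape                               # k1 = k + 1 parameters

b = np.linalg.solve(X.T @ X, X.T @ y)         # OLS coefficients
e = y - X @ b                                 # residuals
XtX_inv = np.linalg.inv(X.T @ X)

# Conventional SEs assume Var(u|X) = σ²: cov = s²(X'X)⁻¹
s2 = e @ e / (n - k1)
se_conv = np.sqrt(np.diag(s2 * XtX_inv))

# HC1 robust SEs: (X'X)⁻¹ X'diag(e²)X (X'X)⁻¹, scaled by n/(n − k1)
meat = X.T @ np.diag(e ** 2) @ X
cov_hc1 = n / (n - k1) * XtX_inv @ meat @ XtX_inv
se_hc1 = np.sqrt(np.diag(cov_hc1))

print("conventional:", np.round(se_conv, 4))  # matches the table above
print("robust HC1:  ", np.round(se_hc1, 4))
```

In practice a library such as statsmodels can report these directly; the manual version above just makes the sandwich formula explicit.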

Frequently Asked Questions

What is OLS regression?

OLS (Ordinary Least Squares) is a method for fitting a linear model to data by minimizing the sum of squared residuals — the differences between observed and predicted values. The fitted line passes through the data in a way that makes the vertical errors as small as possible. OLS is the foundational estimation method in statistics and econometrics.

How do I interpret the slope coefficient?

The slope coefficient (b1) represents the change in Y associated with a one-unit increase in X, holding all else constant. For example, if b1 = 0.6, a one-unit increase in X is associated with a 0.6-unit increase in Y on average. The sign indicates direction: positive means X and Y move together, negative means they move in opposite directions.

What is the difference between simple and multiple regression?

Simple regression uses one independent variable, while multiple regression uses two or more. Multiple regression controls for the effects of other variables — the coefficient on X1 represents the effect of X1 holding X2, X3 constant (ceteris paribus). This helps isolate the individual contribution of each variable to the dependent variable.

How do I read a regression output table?

A regression output table displays each variable's coefficient (estimated effect) and standard error (precision of the estimate). The coefficient tells you the estimated relationship between that variable and Y. The standard error tells you how precisely the coefficient is estimated — smaller standard errors indicate more precise estimates.

What does R-squared mean?

R-squared (R²) measures the proportion of variation in Y explained by the model, ranging from 0 to 1. An R² of 0.60 means 60% of the variation in Y is accounted for by the regressors. However, R² alone does not indicate whether the model is correctly specified or whether the relationships are causal. Adjusted R² penalizes for adding variables and is preferred for comparing models with different numbers of regressors.

What are the key OLS assumptions?

For unbiased coefficient estimation, OLS assumes: (1) the true relationship is linear in parameters, (2) observations are randomly sampled, (3) no perfect multicollinearity among regressors, and (4) errors have zero conditional mean, E(u|X) = 0. For valid standard errors and efficiency, OLS additionally requires: (5) homoskedasticity — constant error variance — and (6) normally distributed errors for small-sample inference. The standard errors reported by this calculator assume homoskedasticity.

Disclaimer

This calculator is for educational purposes only and assumes the classical linear model. Actual econometric analysis requires careful consideration of model specification, data quality, and assumption violations. Standard errors reported are conventional (homoskedastic) and may be invalid under heteroskedasticity. This tool should not be used as the sole basis for research conclusions or investment decisions.