Simple linear regression is the foundation of econometric analysis. Whether you are estimating a stock’s sensitivity to market movements, testing an economic theory about the relationship between two variables, or building a forecasting model, understanding how ordinary least squares (OLS) works is essential. This guide covers the population model, OLS estimation, slope and intercept interpretation, R-squared as a measure of goodness of fit, the key assumptions required for OLS to be unbiased, and the Gauss-Markov theorem that establishes OLS as the best linear unbiased estimator.

What Is Simple Linear Regression?

Simple linear regression models the relationship between two variables: a dependent variable (Y) that you want to explain and a single independent variable (X) that you believe influences it. In finance, common applications include regressing a stock’s return on a market index return, a firm’s revenue on its advertising expenditure, or a bond’s yield on the prevailing interest rate. The population model is:

The Simple Linear Regression Model

Y = β0 + β1X + u

where β0 is the intercept, β1 is the slope, and u is the error term — the sum of all unobserved factors that affect Y beyond X.

The error term u is not a nuisance to be ignored. It captures everything the model leaves out: omitted variables, measurement imprecision, and inherently random variation. The population model describes the true data-generating process, which we never observe directly. Instead, we estimate β0 and β1 from sample data.

When the zero conditional mean assumption holds — E(u|X) = 0 — we can write the population regression function as E(Y|X) = β0 + β1X. Under this condition, a one-unit increase in X changes the expected value of Y by β1. Causal language requires that the unobserved factors in u do not move systematically with X. For a broader introduction to econometric methods and the role of causal inference, see our guide on what is econometrics.

The OLS Method: Finding the Best-Fitting Line

Ordinary least squares (OLS) chooses estimates β̂0 and β̂1 to minimize the sum of squared residuals (SSR) — the total squared distance between the observed values and the fitted line. Why squared? Squaring penalizes large misses more heavily than small ones and produces closed-form solutions that are easy to compute.

OLS Slope Estimate
β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
The sample covariance of X and Y divided by the sample variance of X
OLS Intercept Estimate
β̂0 = ȳ − β̂1 × x̄
The sample mean of Y minus the slope times the sample mean of X
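These formulas can be computed directly. Below is a minimal sketch in Python using a hypothetical five-observation sample (the data values are made up for illustration):

```python
import numpy as np

# Hypothetical toy sample (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.0])

x_bar, y_bar = x.mean(), y.mean()

# Slope: sample covariance of X and Y over sample variance of X
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# Intercept: mean of Y minus slope times mean of X
beta0_hat = y_bar - beta1_hat * x_bar

print(beta1_hat, beta0_hat)  # slope 0.97, intercept 1.09 for this sample
```

The same numbers fall out of np.polyfit(x, y, 1), which also fits a degree-one polynomial by least squares.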

Once we have these estimates, we compute two key quantities for every observation:

  • Fitted value: ŷi = β̂0 + β̂1xi — the model’s prediction for observation i
  • Residual: ûi = yi − ŷi — the difference between the actual and predicted value

An important distinction: the error term (u) is a population concept — the true unobserved deviation from the population regression function. The residual (û) is its sample counterpart — the deviation from the estimated regression line. We never observe u directly; we only observe û.

The residuals also allow us to estimate the error variance σ², which measures how dispersed the unobserved factors are around zero. Under the standard assumptions (introduced below), the following estimator is unbiased:

Error Variance Estimate
σ̂² = Σûi² / (n − 2)
Sum of squared residuals divided by degrees of freedom (n minus 2, because we estimate two parameters)

We divide by (n − 2) rather than n to correct for the downward bias that would result from using the same data to both estimate the coefficients and assess the error variance. The square root of σ̂² is the standard error of the regression (SER), which tells you the typical size of a residual in the units of Y.
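A short continuation of the same hypothetical sample shows the computation; the (n − 2) divisor is the only difference from a naive average of the squared residuals:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.0])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
residuals = y - (b0 + b1 * x)

# Unbiased error-variance estimate: SSR divided by (n - 2) degrees of freedom
sigma2_hat = np.sum(residuals ** 2) / (n - 2)
ser = np.sqrt(sigma2_hat)  # standard error of the regression, in units of Y
```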

Pro Tip

When the regression includes an intercept, OLS guarantees three algebraic properties: the residuals sum to zero (Σûi = 0), the residuals are sample-uncorrelated with X (Σxiûi = 0), and the fitted regression line passes through the point (x̄, ȳ) — the sample means of X and Y.
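These three properties hold mechanically for any OLS fit that includes an intercept, which makes them a useful sanity check on hand-rolled regression code. A sketch using the hypothetical sample from above:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.0])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

# 1. Residuals sum to zero (up to floating-point error)
assert abs(u_hat.sum()) < 1e-10
# 2. Residuals are sample-uncorrelated with X
assert abs(np.sum(x * u_hat)) < 1e-10
# 3. The fitted line passes through the point of sample means
assert abs((b0 + b1 * x.mean()) - y.mean()) < 1e-10
```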

Interpreting the Slope and Intercept

The OLS slope β̂1 tells you how much the predicted value of Y changes for a one-unit increase in X. If you regress a stock’s monthly return on the market’s monthly return and obtain β̂1 = 1.25, then each additional percentage point of market return is associated with a 1.25 percentage point change in the stock’s return.

The intercept β̂0 is the predicted value of Y when X equals zero. Depending on the context, this may or may not have a meaningful interpretation. In a market-return regression, β̂0 represents the stock’s predicted return when the market return is zero — economically interesting but not always the focus of the analysis.

Changing the units of measurement affects the coefficients predictably. If you multiply X by a constant c, the slope is divided by c, but the intercept remains unchanged — the predicted value of Y when X = 0 does not depend on how X is scaled. If you rescale Y instead, both coefficients change proportionally. These are purely mechanical effects that do not alter the underlying relationship.

Quick Interpretation Example

Suppose you regress quarterly revenue (in millions of dollars) on advertising spending (in thousands of dollars) for a sample of 40 firms and obtain:

Predicted Revenue = 2.1 + 0.045 × Advertising

  • Slope: Each additional $1,000 in advertising is associated with $0.045 million ($45,000) in additional revenue
  • Intercept: A firm spending zero on advertising has a predicted revenue of $2.1 million

If you re-express advertising in millions instead of thousands, the slope becomes 45.0 and the intercept stays 2.1. The relationship is identical — only the units changed.
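The rescaling effect is easy to verify on simulated data. The sketch below generates hypothetical advertising and revenue figures matching the example's rough magnitudes, then fits the regression with X in thousands and again in millions:

```python
import numpy as np

rng = np.random.default_rng(0)
adv_thousands = rng.uniform(10, 100, size=40)  # advertising, $ thousands
revenue = 2.1 + 0.045 * adv_thousands + rng.normal(0, 0.3, size=40)

def ols(x, y):
    """Return (intercept, slope) from the simple-regression formulas."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return y.mean() - b1 * x.mean(), b1

b0_k, b1_k = ols(adv_thousands, revenue)         # X in thousands
b0_m, b1_m = ols(adv_thousands / 1000, revenue)  # same X in millions

# Dividing X by 1000 multiplies the slope by 1000; the intercept is unchanged
```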

Association Is Not Causation

The OLS slope measures a statistical association between X and Y. A causal interpretation — that changing X causes Y to change by β1 — requires the zero conditional mean assumption E(u|X) = 0 to hold. If omitted variables in the error term are correlated with X, the estimated slope conflates the effect of X with the influence of those omitted factors.

Goodness of Fit: R-Squared

R-squared (R²) measures how well the regression line fits the data. It quantifies the fraction of the sample variation in Y that is explained by X.

R-Squared
R² = 1 − SSR / SST
One minus the ratio of the residual sum of squares to the total sum of squares

Where:

  • SST (Total Sum of Squares) = Σ(yi − ȳ)² — total variation in Y around its mean
  • SSR (Residual Sum of Squares) = Σûi² — variation left unexplained by the model
  • ESS (Explained Sum of Squares) = SST − SSR — variation explained by the regression

R² always falls between 0 and 1. A value of 0.65 means the regression explains 65% of the sample variation in Y. In simple linear regression, R² equals the square of the sample correlation between X and Y. For a deeper treatment of correlation and covariance, see our guide on correlation and covariance.
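Both identities (the decomposition R² = 1 − SSR/SST, and its equality with the squared sample correlation) can be checked on the hypothetical sample used earlier:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.0])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

sst = np.sum((y - y.mean()) ** 2)  # total variation in Y
ssr = np.sum(u_hat ** 2)           # unexplained variation
r_squared = 1 - ssr / sst

# In simple regression, R² equals the squared sample correlation of X and Y
corr_xy = np.corrcoef(x, y)[0, 1]
```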

R-Squared in Practice

An analyst regresses the monthly return of an energy ETF on crude oil price changes over 48 months and obtains R² = 0.72. This means 72% of the month-to-month variation in the ETF’s return is explained by oil price movements. The remaining 28% reflects other factors: natural gas prices, refining margins, individual company news, and broader market sentiment.

By contrast, regressing a diversified large-cap equity fund on the S&P 500 might yield R² = 0.97, because the fund closely tracks the index. R-squared is context-dependent — what counts as “high” or “low” depends on the variables and the question being asked.

Low R-Squared Does Not Mean a Bad Model

A model with R² = 0.30 can still have a highly statistically significant and economically meaningful slope. R-squared measures in-sample fit, not the importance of the relationship or the model’s predictive power on new data. In finance and economics, R-squared values of 0.10 to 0.40 are common and perfectly acceptable.

Simple Linear Regression Assumptions and OLS Properties

The desirable statistical properties of OLS depend on a set of assumptions about the population model and the data. Wooldridge labels these SLR.1 through SLR.5:

  • SLR.1 (linear in parameters): Y = β0 + β1X + u. Defines the population model OLS is designed to estimate.
  • SLR.2 (random sampling): {(xi, yi): i = 1, …, n} is a random sample. Ensures each observation is drawn independently from the same population.
  • SLR.3 (sample variation in X): the xi values are not all the same. Without variation in X, the slope formula divides by zero.
  • SLR.4 (zero conditional mean): E(u|X) = 0. The most critical assumption, required for unbiasedness and causal interpretation.
  • SLR.5 (homoskedasticity): Var(u|X) = σ². Constant error variance, needed for the Gauss-Markov efficiency result.

Note that SLR.1 says “linear in parameters” — the relationship between X and Y need not be literally a straight line. Models like Y = β0 + β1log(X) + u or Y = β0 + β1X² + u are still linear in β0 and β1, so OLS applies. What matters is that the parameters enter the equation linearly, not that X appears in raw form.

Under assumptions SLR.1 through SLR.4, the OLS estimators are unbiased: E(β̂1) = β1. This means that if you were to draw many random samples and estimate the slope each time, the average of those estimates would equal the true population slope. Unbiasedness is a property of the estimation procedure, not of any single estimate — any particular sample may yield a slope that is above or below the true value.
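Unbiasedness as a property of the procedure can be seen in a small Monte Carlo experiment. The sketch below repeatedly draws samples from a data-generating process that satisfies SLR.1 through SLR.4 (with made-up parameter values) and averages the slope estimates:

```python
import numpy as np

rng = np.random.default_rng(42)
beta0_true, beta1_true = 1.0, 2.0  # assumed true population parameters
n, reps = 50, 5000

slopes = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, n)
    u = rng.normal(0, 1, n)  # E(u|X) = 0 holds by construction
    y = beta0_true + beta1_true * x + u
    slopes[r] = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Individual estimates scatter above and below 2.0; their average is close to it
mean_slope = slopes.mean()
```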

Gauss-Markov Theorem: OLS Is BLUE

Under assumptions SLR.1 through SLR.5, OLS is the Best Linear Unbiased Estimator (BLUE). “Best” means that among all linear estimators that are unbiased, OLS has the smallest variance. No other linear unbiased estimator can produce more precise estimates than OLS when these assumptions hold.

When assumptions are violated, the consequences depend on which assumption fails:

  • SLR.4 violated (zero conditional mean fails): OLS is biased — the slope systematically over- or underestimates β1. This is the most serious problem in applied work and typically arises from omitted variable bias.
  • SLR.5 violated (heteroskedasticity): OLS remains unbiased, but the usual standard errors are incorrect. Hypothesis tests and confidence intervals become unreliable. Heteroskedasticity-robust standard errors can fix this.
  • SLR.2 violated (non-random sampling): Sample selection issues can bias both the slope and intercept. For example, studying only profitable firms when profitability is related to the dependent variable.

For formal methods of testing whether the slope is statistically different from zero, see our guide on hypothesis testing in regression.

Pro Tip

The variance of the OLS slope estimator is Var(β̂1) = σ² / Σ(xi − x̄)². Two practical implications follow: (1) more data (larger n) increases the denominator and reduces variance, and (2) more spread in X values also reduces variance. When designing a study, choosing a sample with wide variation in X yields more precise slope estimates.
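The second implication, that spread in X matters as much as sample size, can be illustrated by simulating slope estimates under a narrow and a wide X design (all values here are invented):

```python
import numpy as np

rng = np.random.default_rng(7)

def slope_estimates(x, reps=4000, sigma=1.0):
    """Simulate OLS slopes for a fixed X design with fresh errors each draw."""
    out = np.empty(reps)
    dev = x - x.mean()
    for r in range(reps):
        y = 1.0 + 2.0 * x + rng.normal(0, sigma, len(x))
        out[r] = np.sum(dev * (y - y.mean())) / np.sum(dev ** 2)
    return out

x_narrow = np.linspace(4.5, 5.5, 30)  # little variation in X
x_wide = np.linspace(0.0, 10.0, 30)   # wide variation in X

var_narrow = slope_estimates(x_narrow).var()
var_wide = slope_estimates(x_wide).var()
# var_narrow is far larger: Var(slope) shrinks as sum((xi - x_bar)^2) grows
```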

Finance Example: Estimating Market Sensitivity

Simple linear regression is widely used in finance to estimate how sensitive a stock’s returns are to overall market movements. Consider an analyst who collects 60 months of return data for Apple (AAPL) and the S&P 500 index, then runs the regression:

RAAPL,t = β̂0 + β̂1 × RS&P500,t + ût

OLS Regression: Apple vs. S&P 500 Monthly Returns

  • Slope (β̂1) = 1.25: a 1% increase in the S&P 500 return is associated with a 1.25% increase in Apple’s return
  • Intercept (β̂0) = 0.30%: Apple’s predicted monthly return when the market return is zero
  • R² = 0.58: 58% of Apple’s return variation is explained by market movements
  • n = 60: five years of monthly observations

The estimated slope of 1.25 tells us that Apple’s returns are about 25% more sensitive to market movements than the market benchmark (which by definition has a slope of 1.0). The R² of 0.58 indicates that market-wide forces account for more than half of Apple’s return variation, with the remaining 42% attributable to firm-specific factors captured in the residuals.

Finance practitioners refer to this estimated slope as the stock’s beta — a measure of systematic risk. This regression is an application of the market model — a time-series regression of realized stock returns on realized market returns. Finance theory, particularly the Capital Asset Pricing Model (CAPM), motivates this specification. Here, we use it purely as an illustration of OLS mechanics: how the slope, intercept, and R-squared are estimated and interpreted.
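The mechanics can be reproduced end to end on simulated returns. The sketch below generates 60 months of synthetic market and stock returns with a built-in slope of 1.25 (the numbers are invented, not actual AAPL data) and recovers beta, alpha, and R²:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
r_market = rng.normal(0.008, 0.04, n)  # synthetic monthly market returns
r_stock = 0.003 + 1.25 * r_market + rng.normal(0, 0.03, n)

dev_m = r_market - r_market.mean()
beta = np.sum(dev_m * (r_stock - r_stock.mean())) / np.sum(dev_m ** 2)
alpha = r_stock.mean() - beta * r_market.mean()

resid = r_stock - (alpha + beta * r_market)
r_squared = 1 - np.sum(resid ** 2) / np.sum((r_stock - r_stock.mean()) ** 2)
# beta lands near 1.25, up to sampling noise; r_squared falls between 0 and 1
```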

Pro Tip

The quality of your regression depends on the data behind it. Use at least 36 to 60 monthly observations for stable estimates. Shorter windows introduce noise; longer windows risk including periods where the company’s risk profile was fundamentally different (e.g., before a major acquisition or industry shift). Always check whether the residuals reveal obvious patterns that suggest the linear model is misspecified.

Simple vs. Multiple Regression

Simple regression uses a single independent variable. When the analysis requires controlling for additional factors, the model extends to multiple regression. Understanding the distinction helps you decide which approach fits your research question.

Simple Regression

  • One independent variable (X)
  • Slope = sample Cov(X,Y) / sample Var(X)
  • Captures total association between X and Y
  • Susceptible to omitted variable bias if relevant factors are excluded
  • Best for: bivariate relationships, preliminary analysis

Multiple Regression

  • Two or more independent variables
  • Each slope is a partial effect (holding other variables constant)
  • Controls for confounding factors
  • Reduces omitted variable bias when relevant controls are included
  • Best for: isolating individual effects, testing theories

When Simple Regression Falls Short

Suppose you regress a firm’s stock return on the market return and find a slope of 0.90. But the firm is in the energy sector, and oil prices also drive its returns. If oil price changes are correlated with the market return, the simple regression slope of 0.90 mixes the effect of the market with the effect of oil prices. A multiple regression that includes both the market return and oil price changes would separate these influences and produce a more accurate estimate of market sensitivity.

Simple regression gives unbiased estimates only if no omitted variable is both correlated with X and affects Y. When that condition is unlikely to hold, multiple regression provides a path to more credible estimates by explicitly controlling for additional factors.
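The oil-price story can be made concrete with a simulation. In the sketch below (all coefficients invented), the true market slope is 0.90, the oil effect is 0.40, and oil changes load on the market with coefficient 0.50, so the simple-regression slope converges to 0.90 + 0.40 × 0.50 = 1.10 rather than 0.90:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
r_market = rng.normal(0, 1, n)
oil = 0.5 * r_market + rng.normal(0, 1, n)  # oil changes correlated with market
r_stock = 0.9 * r_market + 0.4 * oil + rng.normal(0, 1, n)

# Simple regression that omits oil: the slope absorbs oil's correlated effect
dev = r_market - r_market.mean()
b_simple = np.sum(dev * (r_stock - r_stock.mean())) / np.sum(dev ** 2)
# b_simple is close to 1.1, not the true market coefficient of 0.9
```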

Common Mistakes in Simple Regression

Simple linear regression is conceptually straightforward, but several common errors can lead to incorrect conclusions. Being aware of these pitfalls helps you interpret results more carefully:

1. Confusing the population model with the sample regression function. The population model Y = β0 + β1X + u includes the unobservable error term u and describes the true data-generating process. The sample regression function Ŷ = β̂0 + β̂1X produces fitted values from estimated coefficients. The two are conceptually different: the population model is what we are trying to learn about; the sample regression is our best estimate from available data.

2. Confusing the error term with the residual. The error term (u) is a theoretical population quantity that we never observe — it represents all unobserved factors affecting Y. The residual (û) is the observable sample analogue, calculated as the difference between actual and fitted values. Properties that hold for residuals (e.g., Σûi = 0) do not necessarily hold for errors.

3. Interpreting R-squared as prediction accuracy. R-squared measures the proportion of in-sample variation explained by the model, not how accurately the model predicts new data. A model with R² = 0.30 in finance is not “only 30% accurate” — it may still capture the most important economic relationship in the data.

4. Treating association as causation. The OLS slope measures a statistical association. Concluding that X causes changes in Y requires the zero conditional mean assumption E(u|X) = 0 to hold. When omitted factors in u are correlated with X, the slope confounds the effect of X with those omitted influences. For an introduction to causal reasoning in econometrics, see our guide on what is econometrics.

5. Ignoring omitted variable bias. In simple regression, any relevant variable that is excluded from the model and correlated with X will bias the slope estimate. For example, if you regress firm profitability on advertising spending without controlling for firm size, the advertising slope will partly reflect the influence of size. Multiple regression addresses this by adding control variables.

Frequently Asked Questions

What is the difference between simple linear regression and multiple regression?

Simple linear regression uses one independent variable to explain the dependent variable, while multiple regression uses two or more. In simple regression, the slope equals the sample covariance of X and Y divided by the sample variance of X and captures the total association between the two variables. In multiple regression, each slope is a partial effect that holds other variables constant, which helps control for confounding factors and reduces omitted variable bias.

What does R-squared measure in simple linear regression?

R-squared measures the fraction of the sample variation in Y that is explained by the regression on X. In simple linear regression specifically, R-squared equals the square of the sample correlation between X and Y. An R-squared of 0.45 means 45% of the variation in Y is accounted for by the linear relationship with X, while 55% remains unexplained. A low R-squared does not mean the model is wrong — it simply means other factors also influence Y.

Can the OLS slope be interpreted as a causal effect?

Not on its own. The OLS slope estimates a statistical association between X and Y. A causal interpretation — that changing X causes Y to change — requires the zero conditional mean assumption E(u|X) = 0 to hold, meaning no omitted variable in the error term is correlated with X. In observational data, this assumption is often difficult to justify without additional methods such as instrumental variables or controlled experiments.

What assumptions does OLS require?

The standard assumptions are: SLR.1 (linearity in parameters), SLR.2 (random sampling), SLR.3 (sample variation in the independent variable), and SLR.4 (zero conditional mean of the error term). These four assumptions guarantee that OLS is unbiased. Adding SLR.5 (homoskedasticity — constant error variance) gives the Gauss-Markov result: OLS is the Best Linear Unbiased Estimator (BLUE), meaning no other linear unbiased estimator has a smaller variance.

What is the difference between the error term and the residual?

The error term (u) is a theoretical quantity in the population model Y = β0 + β1X + u. It represents all unobserved factors affecting Y and is never directly observable. The residual (û) is the sample counterpart, calculated as the difference between the observed value and the fitted value: ûi = yi − ŷi. Residuals are observable and are used to assess model fit, but their algebraic properties (such as summing to zero) do not necessarily hold for the true errors.

How do you decide which variable is dependent and which is independent?

Economic theory or the research question determines the assignment. The dependent variable (Y) is the outcome you want to explain, and the independent variable (X) is the factor you believe influences it. For example, in a market model regression, the stock’s return is Y because it is the outcome of interest, and the market return is X because it is the explanatory factor. Reversing the assignment changes the slope estimate and alters the economic interpretation of the regression.

Disclaimer

This article is for educational and informational purposes only and does not constitute investment advice. The regression estimates used in examples are illustrative and may differ based on the data source, time period, and methodology. Always conduct your own analysis and consult a qualified financial advisor before making investment decisions. Reference: Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach, 8th Edition, Cengage, 2025.