Module 03 of 12 · 60 min read · Intermediate

OLS assumptions and what they buy you

Linearity, exogeneity, homoskedasticity, no autocorrelation, normality. Which ones matter for unbiasedness, which for inference, which can you relax.

OLS is BLUE — Best Linear Unbiased Estimator — under a specific set of assumptions. Knowing which assumptions matter for which conclusion is the difference between an analyst who can defend a regression and one who can't.

The Gauss-Markov assumptions

  1. Linearity in parameters: y = Xβ + u
  2. No perfect multicollinearity: X has full column rank
  3. Strict exogeneity: E[u | X] = 0
  4. Homoskedasticity: Var(uᵢ | X) = σ² for all i
  5. No autocorrelation: Cov(uᵢ, uⱼ | X) = 0 for i ≠ j

Under these five, OLS has the smallest variance among all linear unbiased estimators (the Gauss-Markov theorem). Add normality of u and OLS gives you exact small-sample t and F distributions for inference.
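For reference, the estimator and its conditional variance under assumptions 1 through 5 can be written out as below. These are the standard textbook expressions in the notation of the list above; n (number of observations) and k (number of columns of X) are symbols introduced here for illustration.

```latex
\hat{\beta} = (X'X)^{-1} X'y,
\qquad
\operatorname{Var}(\hat{\beta} \mid X) = \sigma^{2} (X'X)^{-1},
\qquad
\hat{\sigma}^{2} = \frac{\hat{u}'\hat{u}}{n - k}

% With normal errors added, each coefficient has an exact t distribution:
\frac{\hat{\beta}_j - \beta_j}{\operatorname{se}(\hat{\beta}_j)} \sim t_{n-k},
\qquad
\operatorname{se}(\hat{\beta}_j) = \sqrt{\hat{\sigma}^{2} \left[(X'X)^{-1}\right]_{jj}}
```

Assumption 2 (full column rank) is what guarantees that X'X can be inverted at all.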

Which assumptions matter for what?

For unbiasedness: only exogeneity

Given linearity and full column rank (assumptions 1 and 2), E[u | X] = 0 implies E[β̂ | X] = β, and therefore E[β̂] = β. Heteroskedasticity, autocorrelation, and non-normality do NOT bias OLS. They affect the standard errors, which affect inference.
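A small Monte Carlo sketch can make this concrete. It uses only the Python standard library and a single-regressor model; the true slope, sample size, number of replications, and the way the error variance depends on x are all illustrative assumptions, not values from this module. The errors are heteroskedastic but still satisfy E[u | x] = 0, so the average of the slope estimates should sit close to the true value.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of a simple regression of y on x (intercept included)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(true_beta=2.0, n=200, reps=2000, seed=0):
    """Draw many samples with heteroskedastic errors and collect the slopes."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(1.0, 5.0) for _ in range(n)]
        # The error standard deviation grows with x, but E[u | x] = 0 still holds.
        u = [rng.gauss(0.0, xi) for xi in x]
        y = [1.0 + true_beta * xi + ui for xi, ui in zip(x, u)]
        slopes.append(ols_slope(x, y))
    return slopes


slopes = simulate_slopes()
mean_slope = statistics.fmean(slopes)
print(f"mean of slope estimates: {mean_slope:.3f} (true value 2.0)")
```

Individual estimates still vary from sample to sample; what heteroskedasticity changes is how that spread should be measured, not where the estimates are centered.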

The single most important assumption

Strict exogeneity. If a regressor is correlated with anything in the error term, its coefficient is biased, and no amount of robust standard errors fixes that. Endogeneity gets its own module (Module 6) because it is the central issue in applied work.

For correct standard errors: homoskedasticity + no autocorrelation

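Real-world data violate homoskedasticity routinely (in cross-sections, the error variance often grows with x). Use heteroskedasticity-robust SEs (HC0/HC1/HC3) by default. They remain valid in large samples under heteroskedasticity of unknown form, but they still assume that errors are uncorrelated across observations, so they do not handle correlation within clusters such as households.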
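As a rough illustration of what the robust correction does, the sketch below computes the classical and the HC1 standard error of the slope in a single-regressor model, using only the standard library. The data-generating process (error standard deviation equal to x squared) and all function names are assumptions made for the example. In practice you would normally ask a regression library for these, for instance by passing `cov_type="HC1"` to the `fit` method of a statsmodels OLS model.

```python
import math
import random
import statistics


def fit_simple_ols(x, y):
    """Return (intercept, slope, residuals) for y = a + b*x + u."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def slope_standard_errors(x, resid):
    """Classical and HC1 standard errors for the slope coefficient."""
    n = len(x)
    x_bar = statistics.fmean(x)
    dev = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dev)
    # Classical formula: one common error variance for every observation.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # HC0 sandwich term: each observation contributes its own squared residual.
    hc0 = sum(d * d * e * e for d, e in zip(dev, resid)) / sxx ** 2
    # HC1 applies a degrees-of-freedom correction n / (n - k), with k = 2 here.
    se_hc1 = math.sqrt(hc0 * n / (n - 2))
    return se_classical, se_hc1


rng = random.Random(7)
n_obs = 2000
x = [rng.uniform(0.0, 3.0) for _ in range(n_obs)]
# Illustrative heteroskedasticity: the error standard deviation is x squared.
y = [1.0 + 2.0 * xi + rng.gauss(0.0, xi ** 2) for xi in x]

_, slope, resid = fit_simple_ols(x, y)
se_classical, se_hc1 = slope_standard_errors(x, resid)
print(f"slope={slope:.3f}  classical SE={se_classical:.4f}  HC1 SE={se_hc1:.4f}")
```

In this setup the noisiest observations are also the ones far from the mean of x, so the classical formula understates the uncertainty and the HC1 standard error comes out larger. The point estimate is identical under both.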

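Autocorrelation matters in time series and panel data. Cluster-robust SEs handle within-group correlation, such as repeated observations on the same household; for serial correlation in a single time series, HAC (Newey-West) SEs are the usual counterpart. We cover this in Module 4.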

For exact t-distribution inference: normality of u

With large samples, the central limit theorem gives you approximate normality for β̂ regardless of the distribution of u. So normality matters mostly in small samples (n < 30 or so). With n = 1,000, don't worry about it.
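One way to check this claim is to simulate it. The sketch below, again standard library only, draws strongly skewed errors (an exponential distribution shifted to have mean zero), fits a single-regressor OLS with n = 1,000, and counts how often the usual 95% interval based on the normal critical value 1.96 covers the true slope. The true slope, number of replications, and seed are arbitrary choices for illustration.

```python
import math
import random
import statistics


def slope_and_se(x, y):
    """Slope and classical standard error from a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(ssr / (n - 2) / sxx)


def ci_coverage(true_beta=0.5, n=1000, reps=500, seed=1):
    """Share of nominal 95% intervals that contain the true slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Exponential(1) minus 1: mean zero, variance one, strongly right-skewed.
        u = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        y = [true_beta * xi + ui for xi, ui in zip(x, u)]
        slope, se = slope_and_se(x, y)
        if abs(slope - true_beta) <= 1.96 * se:
            hits += 1
    return hits / reps


coverage = ci_coverage()
print(f"coverage of nominal 95% intervals: {coverage:.3f}")
```

If the normal approximation is doing its job, the printed share should be close to 0.95 even though the errors themselves are far from normal. Rerunning with a much smaller n is a quick way to see where the approximation starts to strain.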

When OLS goes wrong

  • Omitted variable: a variable that affects y and is correlated with an included regressor ends up in the error term and biases the included coefficient
  • Reverse causation: y also affects x, so Cov(x, u) ≠ 0
  • Measurement error in x: classical errors-in-variables shrinks the coefficient toward zero (attenuation bias)
  • Selection: the sample is non-random in a way correlated with y; survivorship bias is the canonical case
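The first and third failure modes are easy to reproduce in a simulation. The sketch below uses made-up data-generating processes (all coefficients and variances are assumptions chosen so the large-sample answers are easy to compute by hand) and only the standard library. In both cases the true effect of x on y is 2.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of a simple regression of y on x (intercept included)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(42)
n = 20_000

# Omitted variable: z affects y (coefficient 3) and is correlated with x.
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [0.5 * zi + rng.gauss(0.0, 1.0) for zi in z]
y = [1.0 + 2.0 * xi + 3.0 * zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
# Regressing y on x alone converges to 2 + 3 * Cov(x, z) / Var(x)
# = 2 + 3 * 0.5 / 1.25 = 3.2, not 2.
ovb_slope = ols_slope(x, y)

# Classical measurement error: we observe x_true plus independent noise.
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
x_noisy = [xi + rng.gauss(0.0, 1.0) for xi in x_true]
y_me = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x_true]
# The slope converges to 2 * Var(x_true) / (Var(x_true) + Var(noise))
# = 2 * 1 / (1 + 1) = 1.0: attenuated toward zero.
attenuated_slope = ols_slope(x_noisy, y_me)

print(f"omitted-variable slope:  {ovb_slope:.3f} (true effect 2.0)")
print(f"measurement-error slope: {attenuated_slope:.3f} (true effect 2.0)")
```

With 20,000 observations the standard errors in both regressions are tiny, which is exactly the problem: the estimates are precise and wrong.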

Robust SEs don't fix bias

Robust standard errors are almost always a valid choice. But if your point estimate is biased, robust SEs just give you a correctly sized confidence interval around the wrong number. Identification first, inference second.

Exercise

You regress test scores on class size and find β̂ = -2 (smaller classes → higher scores). The SE is 0.3. List two ways this estimate could be biased, and the direction of each bias.

LeadAfrik Public Economics Hub