OLS is BLUE — Best Linear Unbiased Estimator — under a specific set of assumptions. Knowing which assumptions matter for which conclusion is the difference between an analyst who can defend a regression and one who can't.
The Gauss-Markov assumptions
- Linearity in parameters: y = Xβ + u
- No perfect multicollinearity: X has full column rank
- Strict exogeneity: E[u | X] = 0
- Homoskedasticity: Var(uᵢ | X) = σ² for all i
- No autocorrelation: Cov(uᵢ, uⱼ | X) = 0 for i ≠ j
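Under these five assumptions, OLS has the smallest variance among all linear unbiased estimators; that is the Gauss-Markov theorem. Add normality of u and OLS becomes minimum-variance among all unbiased estimators, linear or not, and gives you exact small-sample t and F distributions for inference.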
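As a minimal sketch of what the estimator computes, the snippet below fits y = Xβ + u on simulated data where all five assumptions hold by construction. The function name ols, the seed, the sample size, and the true coefficients (1, 2) are illustrative assumptions, not part of the course material.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector for y = X @ beta + u."""
    # lstsq solves the least-squares problem directly, which is numerically
    # safer than forming (X'X)^{-1} X'y by hand.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat

# Simulated data (all values here are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])   # intercept column plus one regressor
beta_true = np.array([1.0, 2.0])
u = rng.normal(size=n)                 # homoskedastic, independent, mean zero
y = X @ beta_true + u

beta_hat = ols(X, y)
full_rank = np.linalg.matrix_rank(X) == X.shape[1]  # no perfect multicollinearity
print(beta_hat, full_rank)
```

The rank check mirrors the "full column rank" assumption: if it fails, the coefficients are not uniquely determined.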
Which assumptions matter for what?
For unbiasedness: only exogeneity
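If E[u | X] = 0, then E[β̂] = β. Heteroskedasticity, autocorrelation, and non-normality do NOT bias OLS. Heteroskedasticity and autocorrelation invalidate the usual standard-error formula; non-normality removes the exact small-sample t and F distributions. All three are inference problems, not bias problems.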
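A small Monte Carlo can make the unbiasedness claim concrete. The sketch below assumes a hypothetical design in which the error standard deviation is proportional to x (so homoskedasticity fails) while E[u | x] = 0 still holds; the seed, sample size, and coefficients are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 200, 2000
beta_true = np.array([1.0, 2.0])   # intercept and slope (illustrative values)

slopes = np.empty(reps)
for r in range(reps):
    x = rng.uniform(1.0, 5.0, size=n)
    X = np.column_stack([np.ones(n), x])
    # Heteroskedastic errors: sd grows with x, but the conditional mean is zero.
    u = rng.normal(size=n) * x
    y = X @ beta_true + u
    slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Averaged over many samples, the slope estimate should sit close to 2.0.
mean_slope = slopes.mean()
print(mean_slope)
```

Individual estimates still scatter around the truth; the point is only that their average does not drift away from it.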
The single most important assumption
Strict exogeneity. If your regressor is correlated with anything in the error term, your coefficient is biased. No amount of robust standard errors fixes that. Endogeneity is Module 6 because it's the central issue in applied work.
For correct standard errors: homoskedasticity + no autocorrelation
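Real-world data violates homoskedasticity routinely (in cross-sections, the error variance often grows with x or differs across groups such as households). Use heteroskedasticity-robust SEs (HC0/HC1/HC3) by default. Given exogeneity and uncorrelated errors, they are valid in large samples whatever form the heteroskedasticity takes; in small samples, HC3 is the more conservative choice.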
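Autocorrelation matters in time series and panel data. Cluster-robust SEs handle correlation within groups, such as repeated observations on the same household; for a single time series, the analogue is HAC (Newey-West) SEs. We cover this in Module 4.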
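To show what a heteroskedasticity-robust SE actually computes, here is a hand-rolled sketch of the classical and HC1 formulas in NumPy. The function name ols_with_se and the simulated design (error spread proportional to |x|) are assumptions for illustration; in practice a statistics package would do this for you.

```python
import numpy as np

def ols_with_se(X, y):
    """Return OLS coefficients with classical and HC1 standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat

    # Classical formula: s^2 (X'X)^{-1}, valid only under homoskedasticity.
    s2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(s2 * XtX_inv))

    # HC1 sandwich: (n / (n - k)) (X'X)^{-1} [X' diag(e_i^2) X] (X'X)^{-1}.
    meat = (X * resid[:, None] ** 2).T @ X
    cov_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
    se_hc1 = np.sqrt(np.diag(cov_hc1))
    return beta_hat, se_classical, se_hc1

# Simulated heteroskedastic data (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
u = rng.normal(size=n) * np.abs(x)   # error spread grows with |x|
y = X @ np.array([1.0, 2.0]) + u

beta_hat, se_classical, se_hc1 = ols_with_se(X, y)
print(beta_hat, se_classical, se_hc1)
```

In this design the classical formula understates the slope's uncertainty, so the HC1 slope SE comes out noticeably larger. The coefficient estimates are identical either way; only the standard errors change.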
For exact t-distribution inference: normality of u
With large samples, the central limit theorem gives you approximate normality for β̂ regardless of the distribution of u. So normality matters mostly in small samples (a common rule of thumb is n < 30, though heavily skewed or heavy-tailed errors need more). With n = 1,000, don't worry about it.
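The large-sample claim can be checked by simulation. The sketch below assumes strongly skewed errors (a centered exponential) with n = 1,000 and counts how often a nominal 5% t-test rejects the true slope; the seed, replication count, and coefficients are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps, beta1 = 1000, 2000, 2.0

rejections = 0
for _ in range(reps):
    x = rng.normal(size=n)
    u = rng.exponential(1.0, size=n) - 1.0   # skewed errors with mean zero
    y = 1.0 + beta1 * x + u

    # Simple-regression OLS formulas for slope, intercept, and slope SE.
    xc = x - x.mean()
    b1 = xc @ y / (xc @ xc)
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x
    se_b1 = np.sqrt(resid @ resid / (n - 2) / (xc @ xc))

    # Test the TRUE slope; a valid 5% test should reject about 5% of the time.
    t_stat = (b1 - beta1) / se_b1
    rejections += abs(t_stat) > 1.96

rejection_rate = rejections / reps
print(rejection_rate)
```

If the normal approximation is working, the rejection rate should land near 0.05 even though u is far from normal.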
When OLS goes wrong
- Omitted variable: a variable that affects y and is correlated with an included regressor; leaving it out biases the included coefficient
- Reverse causation: y also affects x, so Cov(x, u) ≠ 0
- Measurement error in x: classical errors-in-variables shrinks the coefficient toward zero (attenuation bias)
- Selection: the sample is non-random in a way correlated with y; survivorship bias is the canonical case
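Two of the failure modes above have simple closed-form biases that a simulation can reproduce. In this sketch the true slope on x is 2 in both cases; the variable names, coefficients, and noise variances are assumptions chosen so the large-sample answers are easy to work out by hand.

```python
import numpy as np

def slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(3)
n = 200_000

# Omitted variable: z affects y and is correlated with x, but is left out.
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)              # Var(x) = 1.25, Cov(x, z) = 0.5
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)
# Large-sample limit: 2 + 3 * Cov(x, z) / Var(x) = 2 + 3 * 0.5 / 1.25 = 3.2
slope_omitted = slope(x, y)

# Classical measurement error: we observe x_true plus independent noise.
x_true = rng.normal(size=n)
x_obs = x_true + rng.normal(size=n)           # noise variance equals Var(x_true)
y2 = 1.0 + 2.0 * x_true + rng.normal(size=n)
# Large-sample limit: 2 * Var(x_true) / (Var(x_true) + Var(noise)) = 2 * 1/2 = 1.0
slope_attenuated = slope(x_obs, y2)

print(slope_omitted, slope_attenuated)
```

Neither estimate moves toward 2 as n grows: more data only tightens the estimate around the biased limit.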
Robust SEs don't fix bias
Robust standard errors are almost always a valid choice. But if your point estimate is biased, robust SEs just give you an honestly sized confidence interval centered on the wrong number. Identification first, inference second.
Exercise
You regress test scores on class size and find β̂ = -2 (smaller classes → higher scores). The SE is 0.3. List two ways this estimate could be biased, and the direction of each bias.