Module 09 of 1255 min readIntermediate

Panel data — fixed and random effects

When fixed effects beat random effects, when neither saves you, and the within-transformation that demystifies the algebra.



Panel data — repeated observations on the same units over time — is the gold standard for many empirical questions because it lets you control for time-invariant unit characteristics that you couldn't measure or include in a pure cross-section.

What panel data buys you

Suppose you're regressing wages on union membership in a cross-section. Workers who join unions differ from non-members in countless unobserved ways: family background, work ethic, industry-specific norms. Those unobservables are baked into the error term, biasing the union coefficient.

With panel data — the same worker observed before and after joining a union — you can subtract the worker's average across periods. Anything time-invariant (family background, intelligence, work ethic) cancels out. What remains is the within-worker change, identified off variation that no cross-section could exploit.
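
The cancellation can be written out in three lines. As a sketch, assume a linear wage equation with a worker-specific intercept αᵢ, where y is the wage and x is union status (this notation is introduced here for illustration and matches the equations used later in the module):

```latex
\begin{aligned}
y_{it} &= \alpha_i + \beta x_{it} + u_{it} \\
\bar{y}_i &= \alpha_i + \beta \bar{x}_i + \bar{u}_i \\
y_{it} - \bar{y}_i &= \beta\,(x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)
\end{aligned}
```

The second line averages the first over t for a given worker. Because αᵢ does not vary over time, its average is itself, and it drops out of the difference in the third line.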

Fixed effects: the within transformation

Demean every variable within unit:

math
ỹᵢₜ = yᵢₜ − ȳᵢ
x̃ᵢₜ = xᵢₜ − x̄ᵢ

Then run OLS on the demeaned variables. The result is identical to including a dummy for every unit (by the Frisch-Waugh-Lovell theorem), but computationally cheaper for large panels with thousands of units.
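
The equivalence between demeaning and unit dummies can be checked numerically. The following is a minimal sketch on simulated data, using only numpy; the panel dimensions, the true coefficient of 2.0, and the helper name `demean_by_unit` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 6                       # hypothetical panel: 50 units, 6 periods
unit = np.repeat(np.arange(N), T)  # unit index for each row
alpha = rng.normal(size=N)         # unobserved unit effects

# The regressor is correlated with the unit effect, so pooled OLS would be biased
x = 0.8 * alpha[unit] + rng.normal(size=N * T)
y = 2.0 * x + alpha[unit] + rng.normal(size=N * T)

def demean_by_unit(v, unit, n_units):
    """Subtract each unit's time average from its observations."""
    sums = np.bincount(unit, weights=v, minlength=n_units)
    counts = np.bincount(unit, minlength=n_units)
    return v - (sums / counts)[unit]

# Within estimator: OLS of demeaned y on demeaned x (no intercept needed)
x_w = demean_by_unit(x, unit, N)
y_w = demean_by_unit(y, unit, N)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# LSDV: OLS of y on x plus one dummy per unit
D = (unit[:, None] == np.arange(N)[None, :]).astype(float)
X = np.column_stack([x, D])
beta_lsdv = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(beta_within, beta_lsdv)  # the two slopes agree up to floating-point error
```

The dummy regression builds a 300 × 51 design matrix here; with thousands of units that matrix becomes large, which is why the demeaned version is preferred in practice.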

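If you demean by hand and then run plain OLS, the reported standard errors are too small, because the software does not know that N unit means were estimated first. The residual degrees of freedom are NT − N − K, not NT − K. Dedicated panel routines (xtreg, fe in Stata; plm in R; PanelOLS in Python's linearmodels) apply this adjustment automatically.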
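
The size of the adjustment can be written explicitly. This sketch assumes a balanced panel with N units, T periods, K regressors, and conventional (non-clustered) standard errors; "naive" refers to what plain OLS on hand-demeaned data would report.

```latex
\hat{\sigma}_u^2 = \frac{\sum_{i}\sum_{t} \hat{\tilde{u}}_{it}^{2}}{NT - N - K},
\qquad
\mathrm{se}_{\text{FE}} = \mathrm{se}_{\text{naive}} \times \sqrt{\frac{NT - K}{NT - N - K}}
```

With T = 2, K = 1 and N = 1000, the factor is √(1999/999) ≈ 1.41, so the naive standard errors would be understated by roughly 30%. The gap shrinks as T grows.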

What FE can and cannot do

  • Can do: control for ALL time-invariant unit characteristics, observed or unobserved
  • Can do: identify effects of within-unit variation in regressors over time
  • Cannot do: estimate the effect of any time-invariant regressor (it's collinear with the unit dummy)
  • Cannot do: solve endogeneity that comes from time-varying unobservables (job-specific shocks correlated with both wages and union status)
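
The reason a time-invariant regressor cannot be estimated is visible as soon as it is demeaned. A tiny sketch, with a made-up indicator `female` for four hypothetical units observed over three periods:

```python
import numpy as np

N, T = 4, 3
unit = np.repeat(np.arange(N), T)

# Hypothetical time-invariant regressor: same value in every period for a unit
female = np.array([1.0, 0.0, 1.0, 0.0])[unit]

unit_means = np.bincount(unit, weights=female) / np.bincount(unit)
female_within = female - unit_means[unit]

print(female_within)  # every entry is zero: no within-unit variation is left
```

A column of zeros carries no information, so its coefficient is not identified under FE.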

Random effects

RE assumes the unit-specific term is drawn from a distribution uncorrelated with the regressors. If true, RE is more efficient than FE — it uses both within and between variation. If false, RE is biased and inconsistent.

math
yᵢₜ = α + β · xᵢₜ + αᵢ + uᵢₜ, where αᵢ is random and Cov(αᵢ, xᵢₜ) = 0
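
One way to see how RE combines within and between variation is the quasi-demeaning form of the GLS estimator. This is the standard textbook representation for a balanced panel, not something specific to this module; σ²ᵤ and σ²_α denote the variances of uᵢₜ and αᵢ.

```latex
y_{it} - \theta \bar{y}_i
  = (1-\theta)\,\alpha + \beta\,(x_{it} - \theta \bar{x}_i)
  + \bigl[(1-\theta)\,\alpha_i + u_{it} - \theta \bar{u}_i\bigr],
\qquad
\theta = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\,\sigma_\alpha^2}}
```

When θ = 0 this is pooled OLS; as θ approaches 1 (large T or large σ²_α) it approaches the FE within transformation. RE sits between the two, subtracting only a fraction of each unit mean.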

The Hausman test

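Compare the FE and RE estimates. Under the RE assumption both are consistent, so they should differ only by sampling noise; if the assumption fails, only FE remains consistent. If the estimates differ significantly, reject the RE assumption and use FE. If they don't, the test finds no evidence against RE, and RE is the more efficient choice. Failing to reject is not proof that the assumption holds, especially in small samples where the test has low power.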
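
The classical form of the test statistic is a quadratic form in the difference between the two coefficient vectors. Below is a minimal sketch of that formula; the function name `hausman` and the input numbers are invented for illustration and do not come from any dataset.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Classical Hausman statistic H = d' (V_fe - V_re)^{-1} d, d = b_fe - b_re.

    Inputs are the coefficients and covariance matrices of the time-varying
    regressors that appear in both models. Under the RE assumption H is
    asymptotically chi-square with len(d) degrees of freedom.
    """
    d = np.atleast_1d(b_fe) - np.atleast_1d(b_re)
    V_diff = np.atleast_2d(V_fe) - np.atleast_2d(V_re)
    H = float(d @ np.linalg.solve(V_diff, d))
    return H, d.size

# Made-up single-regressor example
H, df = hausman(b_fe=0.10, b_re=0.16, V_fe=0.0009, V_re=0.0005)
print(H, df)  # H is about 9 with 1 degree of freedom
```

The 5% critical value of a chi-square with one degree of freedom is 3.84, so these illustrative numbers would reject RE. Two caveats: the difference of covariance matrices need not be positive definite in finite samples, and this form assumes RE is fully efficient, which does not hold with clustered or otherwise robust standard errors.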

In applied work, default to FE

RE's identifying assumption — that unobserved unit effects are uncorrelated with regressors — is rarely defensible in causal contexts. Stick with FE unless you have a specific reason (RCT analysis with random clusters, hierarchical Bayesian framework). Reviewers will demand the Hausman test anyway.

Two-way fixed effects

Add time fixed effects to absorb unit-invariant time shocks (a recession, a national policy that hits everyone simultaneously, secular trends in the outcome):

math
yᵢₜ = αᵢ + λₜ + β · xᵢₜ + uᵢₜ

This is the standard panel specification in policy evaluation. The coefficient is identified from variation in xᵢₜ that is both within-unit (over time) and within-time (across units), i.e. xᵢₜ deviating from both its unit mean and its time mean. With staggered treatment timing and treatment effects that vary across units or over time, this specification runs into the negative-weights problem covered in Module 8.
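
In a balanced panel, "deviating from both its unit mean and its time mean" has a simple closed form: subtract the unit mean and the time mean, then add back the grand mean. The sketch below checks on simulated data that this double demeaning reproduces the coefficient from a regression with explicit unit and time dummies. The data-generating process and the name `two_way_demean` are assumptions, and the shortcut is only exact for balanced panels.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 30, 5
unit = np.repeat(np.arange(N), T)   # rows ordered unit by unit
time = np.tile(np.arange(T), N)
a = rng.normal(size=N)              # unit effects
lam = rng.normal(size=T)            # common time shocks

x = 0.5 * a[unit] + 0.5 * lam[time] + rng.normal(size=N * T)
y = 1.5 * x + a[unit] + lam[time] + rng.normal(size=N * T)

def two_way_demean(v):
    """v_it - unit mean - time mean + grand mean (balanced panels only)."""
    m = v.reshape(N, T)
    out = m - m.mean(axis=1, keepdims=True) - m.mean(axis=0, keepdims=True) + m.mean()
    return out.ravel()

x_dd, y_dd = two_way_demean(x), two_way_demean(y)
beta_dd = (x_dd @ y_dd) / (x_dd @ x_dd)

# Same model with explicit dummies (one time dummy dropped to avoid collinearity)
D_unit = (unit[:, None] == np.arange(N)).astype(float)
D_time = (time[:, None] == np.arange(1, T)).astype(float)
X = np.column_stack([x, D_unit, D_time])
beta_dummies = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(beta_dd, beta_dummies)  # equal up to floating-point error
```

In unbalanced panels the closed form no longer holds exactly, and software typically uses iterative demeaning instead.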

First differences vs FE

First differences:

math
Δyᵢₜ = β · Δxᵢₜ + Δuᵢₜ

Differencing removes the unit fixed effect. With T = 2, FD and FE give identical estimates. With T > 2 they differ: FE uses all within-unit variation, while FD uses only adjacent-period changes. FD is the more robust choice when uᵢₜ is highly persistent (close to a unit root), since Δuᵢₜ is then approximately serially uncorrelated; FE is more efficient when uᵢₜ is iid.
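
The T = 2 equivalence is easy to verify: with two periods, each demeaned value is plus or minus half the first difference, so the two estimators use exactly the same information. A minimal simulated check, with all parameter values chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200
a = rng.normal(size=N)                          # unit effects
x = 0.7 * a[:, None] + rng.normal(size=(N, 2))  # columns are periods 1 and 2
y = 1.0 * x + a[:, None] + rng.normal(size=(N, 2))

# First differences: regress (y2 - y1) on (x2 - x1), no intercept
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
beta_fd = (dx @ dy) / (dx @ dx)

# Fixed effects: regress unit-demeaned y on unit-demeaned x
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_fe = (xw * yw).sum() / (xw * xw).sum()

print(beta_fd, beta_fe)  # identical up to floating-point error
```

Changing the second dimension to three or more periods (and differencing adjacent columns) makes the two numbers diverge.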

Clustered standard errors

Always cluster at the unit level, or higher if treatment varies at a higher level (say, at the state level when units are workers within states). As a rule of thumb, cluster-robust asymptotics need roughly 30 or more clusters; with fewer, use the wild cluster bootstrap.
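
To make the mechanics concrete, here is a sketch of the cluster-robust "sandwich" variance for a one-regressor FE model, written out by hand in numpy. The simulated data use persistent (random-walk) regressors and errors so that observations within a unit are correlated. The G/(G − 1) small-sample factor is one common choice and is an assumption here; packages differ in the exact correction they apply.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 8
unit = np.repeat(np.arange(N), T)

def demean(v):
    """Subtract unit means (rows of the N x T layout are units)."""
    m = v.reshape(N, T)
    return (m - m.mean(axis=1, keepdims=True)).ravel()

# Random walks within each unit: both x and the error are serially correlated
x = np.cumsum(rng.normal(size=(N, T)), axis=1).ravel()
u = np.cumsum(rng.normal(size=(N, T)), axis=1).ravel()
y = 1.0 * x + u

xw, yw = demean(x), demean(y)
beta = (xw @ yw) / (xw @ xw)
resid = yw - beta * xw

# Conventional variance: assumes iid errors, with the FE degrees-of-freedom fix
var_iid = (resid @ resid) / (N * T - N - 1) / (xw @ xw)

# Cluster-robust variance: sum x * resid within each unit, then square the sums
score_by_unit = np.bincount(unit, weights=xw * resid, minlength=N)
var_cluster = (score_by_unit @ score_by_unit) / (xw @ xw) ** 2
var_cluster *= N / (N - 1)  # small-sample factor (assumed convention)

se_iid, se_cluster = np.sqrt(var_iid), np.sqrt(var_cluster)
print(se_iid, se_cluster)
```

Summing the scores within a unit before squaring is what lets arbitrary within-unit correlation enter the variance; the conventional formula squares each observation separately and ignores it.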

Within R²

Standard R² becomes uninformative once you include unit dummies — most variance is absorbed. Report within-R² (variance of demeaned y explained by demeaned x) and overall R² together. Within-R² is the right number for assessing fit of the within-unit relationship.
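
Written out for a single regressor, and using the demeaned variables defined earlier, the within-R² is the usual R² formula applied to the transformed data. This is a sketch of the common definition; individual packages may label or compute their R² variants slightly differently.

```latex
R^2_{\text{within}}
  = 1 - \frac{\sum_{i}\sum_{t} \bigl(\tilde{y}_{it} - \hat{\beta}\,\tilde{x}_{it}\bigr)^2}
             {\sum_{i}\sum_{t} \tilde{y}_{it}^{\,2}}
```

Because the unit means are removed from both numerator and denominator, variance explained by the fixed effects themselves does not count toward this number.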

Exercise

You have 10 years of data on 1,000 SMEs. You want to estimate the effect of access-to-credit (a binary indicator that flips on for some firms in some years) on revenue growth. Write the FE regression you'd run and explain what variation identifies the coefficient.

LeadAfrik Public Economics Hub