Module 09 of 12 · 60 min read · Intermediate

Panel data: xtset, xtreg, fixed and random effects

xtset, xtreg fe, xtreg re, the Hausman test, clustered standard errors, and the within transformation that demystifies fixed effects.


Learning objectives

By the end of this module, you should be able to:

  • 01 Declare a panel structure with xtset and inspect it with xtdescribe and xtsum
  • 02 Fit fixed-effects and random-effects panel models with xtreg
  • 03 Explain the within transformation that powers fixed effects and what it removes from the model
  • 04 Run the Hausman test to compare FE and RE specifications

Panel data, meaning repeated observations on the same units (firms, countries, individuals) over time, is where Stata is at its best. xtset declares the panel structure; xtreg fits panel models; and the within transformation that powers fixed effects takes one command and one option: xtreg with fe.

xtset — declare the panel

stata
xtset bank_id year // panel ID + time variable
xtdescribe // panel structure: balanced or not
xtsum lending_rate // within / between / overall variance
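To try these commands without a real dataset, the sketch below simulates a small balanced panel. Everything about the data is an assumption made for illustration: the 50 banks, the years 2010 to 2019, the seed, and the coefficients are invented; only the variable names follow the module.

```stata
* Simulated toy panel (hypothetical data): 50 banks observed 2010-2019
clear
set seed 12345
set obs 50
generate bank_id = _n
generate bank_effect = rnormal()              // time-invariant bank heterogeneity
expand 10                                     // 10 yearly rows per bank
bysort bank_id: generate year = 2009 + _n     // years 2010..2019 within each bank
generate deposit_rate = 2 + 0.5*bank_effect + rnormal()
generate lending_rate = 5 + 0.8*deposit_rate + bank_effect + rnormal()

xtset bank_id year        // declares the panel; reports whether it is balanced
xtdescribe                // participation pattern across years
xtsum lending_rate        // overall, between, and within variation
display "between SD = " r(sd_b) "   within SD = " r(sd_w)
```

The between standard deviation describes how much bank averages differ from each other; the within standard deviation describes how much a bank moves around its own average. Fixed effects use only the second kind of variation.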

xtreg with fixed effects

stata
xtreg lending_rate deposit_rate, fe // bank fixed effects
xtreg lending_rate deposit_rate i.year, fe // bank FE + year FE
xtreg lending_rate deposit_rate, fe vce(cluster bank_id) // clustered SEs

What the within transformation actually does

Fixed effects are equivalent to subtracting the bank-specific mean from each variable, then running OLS on the demeaned data. This 'within' estimator removes every time-invariant bank characteristic, observed or not: a variable that never changes within a bank equals its own bank mean, so it demeans to zero. The cost: the coefficient on any time-invariant predictor is unidentified (you can't include 'bank size at IPO' as a regressor when it never varies within bank).
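The equivalence can be checked by hand. The following is a minimal sketch on simulated data (the 40 banks, 8 years, seed, and coefficients are assumptions chosen for illustration): it demeans both variables within bank, runs OLS on the result, and compares the slope with xtreg, fe.

```stata
* Hypothetical simulated panel: 40 banks, 8 years each
clear
set seed 2024
set obs 40
generate bank_id = _n
generate bank_effect = rnormal()
expand 8
bysort bank_id: generate year = 2011 + _n
generate deposit_rate = 2 + 0.7*bank_effect + rnormal()
generate lending_rate = 5 + 0.8*deposit_rate + bank_effect + rnormal()
xtset bank_id year

* Step 1: subtract each bank's own mean from every variable
bysort bank_id: egen double mean_lend = mean(lending_rate)
bysort bank_id: egen double mean_dep  = mean(deposit_rate)
generate double lend_within = lending_rate - mean_lend
generate double dep_within  = deposit_rate - mean_dep

* Step 2: OLS on the demeaned data (no constant: demeaned variables have mean zero)
regress lend_within dep_within, noconstant
scalar b_manual = _b[dep_within]

* Step 3: the built-in within estimator
xtreg lending_rate deposit_rate, fe
scalar b_fe = _b[deposit_rate]

display "manual: " b_manual "   xtreg, fe: " b_fe
```

The two slopes should agree up to rounding. The standard errors will not: regress on the demeaned data does not know that 40 bank means were estimated, so it uses too many degrees of freedom, which is one reason to let xtreg do the work.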

Random effects

stata
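xtreg lending_rate deposit_rate, fe // fixed effects
estimates store fe // save FE results for the test
xtreg lending_rate deposit_rate, re // random effects
hausman fe . // Hausman test: stored FE vs. current RE estimates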

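Random effects assumes the bank effect is uncorrelated with the regressors. The Hausman test compares the FE and RE coefficient estimates: if that assumption holds, both are consistent and should be close. If they differ significantly, the random-effects assumption is rejected, so use FE. Note that hausman needs both models fit with the default standard errors; it does not accept estimates fit with vce(robust) or vce(cluster).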
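A simulated case makes the logic concrete. In the sketch below (all data and parameter values are assumptions for illustration), deposit_rate is deliberately built to be correlated with the bank effect, so the random-effects assumption fails by construction and the test should tend to reject.

```stata
* Hypothetical simulated panel: 200 banks, 6 years each
clear
set seed 777
set obs 200
generate bank_id = _n
generate bank_effect = rnormal()
expand 6
bysort bank_id: generate year = 2013 + _n
* deposit_rate loads on the bank effect, violating the RE assumption
generate deposit_rate = 2 + bank_effect + rnormal()
generate lending_rate = 5 + 0.8*deposit_rate + bank_effect + rnormal()
xtset bank_id year

quietly xtreg lending_rate deposit_rate, fe
estimates store fe
quietly xtreg lending_rate deposit_rate, re
estimates store re

* Consistent estimator first, efficient estimator second
hausman fe re
display "chi2 = " r(chi2) "   p-value = " r(p)
```

Here the FE slope should sit near the true value of 0.8 used in the simulation, while the RE slope is pulled upward by the correlation between deposit_rate and the bank effect. That gap is what the test statistic measures.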

reghdfe — high-dimensional fixed effects

stata
* User-written, fast, recommended for multi-way FE
ssc install ftools // dependency of reghdfe
ssc install reghdfe
reghdfe lending_rate deposit_rate, absorb(bank_id year) cluster(bank_id)
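As a sanity check, absorbing bank and year effects with reghdfe should give the same slope as xtreg, fe with i.year dummies. The sketch below uses simulated data (banks, years, seed, and coefficients are all assumptions) and only calls reghdfe if it is installed.

```stata
* Hypothetical simulated panel: 60 banks, 2010-2019, with a common time trend
clear
set seed 4321
set obs 60
generate bank_id = _n
generate bank_effect = rnormal()
expand 10
bysort bank_id: generate year = 2009 + _n
generate deposit_rate = 2 + 0.5*bank_effect + 0.1*(year - 2014) + rnormal()
generate lending_rate = 5 + 0.8*deposit_rate + bank_effect + 0.2*(year - 2014) + rnormal()
xtset bank_id year

* Bank FE via the within transformation, year FE via dummies
xtreg lending_rate deposit_rate i.year, fe
scalar b_xtreg = _b[deposit_rate]

* Same model with both sets of effects absorbed (skipped if reghdfe is missing)
capture which reghdfe
if _rc == 0 {
    reghdfe lending_rate deposit_rate, absorb(bank_id year)
    display "xtreg: " b_xtreg "   reghdfe: " _b[deposit_rate]
}
else {
    display "reghdfe is not installed; install it from SSC to run the comparison"
}
```

With ten year dummies the difference is cosmetic. The payoff from absorb() comes when a fixed-effect dimension has thousands of levels and dummy variables become impractical.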

Time-series operators

stata
xtset bank_id year
generate lag_rate = L.lending_rate // L. lag
generate lead_rate = F.lending_rate // F. lead
generate diff_rate = D.lending_rate // D. first difference
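The operators are panel-aware and gap-aware, which a subscript like [_n-1] is not. The tiny dataset below is invented for illustration: bank 1 has no 2020 observation.

```stata
* Hypothetical six-row panel; bank 1 is missing year 2020
clear
input bank_id year lending_rate
1 2018 5.0
1 2019 5.5
1 2021 6.0
2 2018 4.0
2 2019 4.2
2 2020 4.1
end
xtset bank_id year

generate lag_rate  = L.lending_rate        // previous YEAR within the same bank
generate diff_rate = D.lending_rate        // change from the previous year
generate naive_lag = lending_rate[_n-1]    // previous ROW, ignoring panel and gaps

list, sepby(bank_id) noobs
```

In this toy data, lag_rate is missing for bank 1 in 2021 because 2020 is absent, and missing for bank 2 in 2018 because the operator never reaches across panels. naive_lag fills both cells with values from the wrong row.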

Two-way clustering needs a special command

xtreg supports only one-way clustering. For two-way clustering (e.g., by both bank and year), use reghdfe or ivreg2 with cluster(bank_id year). The choice can change the SEs noticeably, by 30-50% in some applications.
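A rough way to see the difference is to fit the same model twice and compare standard errors. The sketch below is an illustration under stated assumptions: the data are simulated with a common yearly shock in both the regressor and the error, the numbers are invented, and the reghdfe calls run only if the package is installed.

```stata
* Hypothetical simulated panel: 80 banks, 15 years, with shocks shared within a year
clear
set seed 99
set obs 80
generate bank_id = _n
generate bank_effect = rnormal()
expand 15
bysort bank_id: generate year = 2004 + _n

* Draw one shock per year and copy it to every bank in that year
generate shock = rnormal()
bysort year (bank_id): replace shock = shock[1]
generate x_shock = rnormal()
bysort year (bank_id): replace x_shock = x_shock[1]

generate deposit_rate = 2 + x_shock + rnormal()
generate lending_rate = 5 + 0.8*deposit_rate + bank_effect + shock + rnormal()

capture which reghdfe
if _rc == 0 {
    reghdfe lending_rate deposit_rate, absorb(bank_id) vce(cluster bank_id)
    scalar se_oneway = _se[deposit_rate]
    reghdfe lending_rate deposit_rate, absorb(bank_id) vce(cluster bank_id year)
    scalar se_twoway = _se[deposit_rate]
    display "one-way SE: " se_oneway "   two-way SE: " se_twoway
}
else {
    display "reghdfe is not installed; install it from SSC to run the comparison"
}
```

The point estimate is identical across the two fits; only the standard error changes. With as few as 15 year clusters, the two-way estimate is itself noisy, so treat the comparison as a demonstration rather than a recommendation.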

Exercise

Declare a panel with bank_id and year, then fit a fixed-effects model of lending_rate on deposit_rate.

Key takeaways

  • xtset id_var time_var, then xtreg y x, fe vce(cluster id_var) for fixed effects with clustered SEs
  • Fixed effects subtract the unit mean from each variable, so time-invariant predictors drop out
  • The Hausman test compares FE vs RE. Reject H0 → use FE (RE inconsistent under correlated unit effects)
  • For two-way FE (or many-way), use reghdfe from SSC — much faster and supports clustering on multiple dimensions

Further reading

  1. Econometric Analysis of Cross Section and Panel Data (2nd Edition)

    Jeffrey M. Wooldridge · MIT Press · 2010. The graduate-level reference for panel methods.

  2. Microeconometrics Using Stata, Chapter 9: Linear Panel-Data Models

    A. Colin Cameron & Pravin K. Trivedi · Stata Press · 2010

LeadAfrik · Public Economics Hub