Difference-in-differences — Econometrics Module 8

Difference-in-differences (DiD) is the workhorse of policy evaluation in economics. The intuition is simple: when a policy hits some units and not others, compare the before-after change in the treated group to the before-after change in a control group. The first difference subtracts unit-specific levels; the second subtracts shared time trends. What's left is the treatment effect — under one strong assumption.

The two-by-two case

Two groups (treated, control) and two periods (pre, post). Mean outcome in each cell:

DiD = (Y̅_treated,post − Y̅_treated,pre) − (Y̅_control,post − Y̅_control,pre)

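The first parenthesis is what happened to the treated group. The second is what would have happened to it anyway, proxied by the control group. Their difference is the estimated causal effect, provided the control group's change is a valid stand-in for the treated group's counterfactual change.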
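The arithmetic can be sketched in a few lines of Python. The function name did_2x2 and all outcome values here are made up for illustration; only the formula comes from the text.

```python
from statistics import mean

def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two DiD: treated before-after change minus control before-after change."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical outcomes (invented numbers, not real data)
treated_pre = [10.0, 12.0, 11.0]    # cell mean 11
treated_post = [15.0, 17.0, 16.0]   # cell mean 16
control_pre = [8.0, 9.0, 10.0]      # cell mean 9
control_post = [10.0, 11.0, 12.0]   # cell mean 11

estimate = did_2x2(treated_pre, treated_post, control_pre, control_post)
print(estimate)  # (16 - 11) - (11 - 9) = 3.0
```

The treated group rose by 5 and the control group by 2, so 3 of the 5 is attributed to treatment.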

The regression form

yᵢₜ = α + β · Treatedᵢ + γ · Postₜ + δ · (Treatedᵢ × Postₜ) + uᵢₜ

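δ is the DiD coefficient: the causal estimate, and numerically identical to the two-by-two difference of cell means. α is the control group's pre-period mean; β is the pre-period level difference between groups; γ is the pre-to-post change in the control group, which the design treats as common to both groups. The interaction picks up what's specific to the treated group in the post period.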
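To see that the regression reproduces the cell-mean calculation, here is a minimal sketch that fits the equation by ordinary least squares using only the standard library. The helper ols and the data are illustrative assumptions; in practice you would use a regression package rather than hand-rolled normal equations.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[a] * row[b] for row in X) for b in range(k)]
        + [sum(row[a] * yi for row, yi in zip(X, y))]
        for a in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [u - factor * v for u, v in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical observations: (treated, post, outcome)
data = [
    (1, 0, 10.0), (1, 0, 12.0), (1, 0, 11.0),
    (1, 1, 15.0), (1, 1, 17.0), (1, 1, 16.0),
    (0, 0, 8.0), (0, 0, 9.0), (0, 0, 10.0),
    (0, 1, 10.0), (0, 1, 11.0), (0, 1, 12.0),
]

# Columns: constant, Treated, Post, Treated x Post
X = [[1.0, d, p, d * p] for d, p, _ in data]
y = [outcome for _, _, outcome in data]

alpha, beta, gamma, delta = ols(X, y)
print(round(alpha, 6), round(beta, 6), round(gamma, 6), round(delta, 6))
```

With these numbers the cell means are 11 and 16 for the treated group and 9 and 11 for the control group, so delta equals the two-by-two DiD of 3 up to floating-point error, alpha is the control pre-period mean of 9, and beta and gamma are both 2.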

The parallel-trends assumption

The crucial assumption: absent treatment, the treated and control groups would have followed the same trajectory. We can never directly observe this — the treated group did get treated. But we can defend the assumption by examining pre-treatment trends.

Plot the pre-trends

Always show outcomes for both groups in the periods before treatment. If they were trending in parallel, the assumption is plausible. If they diverged in the years leading up to the treatment, the design is broken — what you'll attribute to treatment is just trend continuation.

Event-study plots

The modern presentation. Estimate a coefficient for each time period relative to treatment (k = -3, -2, -1, 0, +1, +2, +3...), with k = -1 as the omitted baseline. Plot the coefficients with confidence intervals. A credible DiD shows:

Coefficients near zero for k < 0 (no pre-trend)
A clear jump at k = 0 (treatment takes effect)
Persistence or fade in k > 0 (the dynamics of the effect)
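In the simplest setting (one treated group, one control group, a common treatment date), the event-study coefficients reduce to the treated-control gap at each k minus the gap at the baseline k = -1. The sketch below computes those point estimates only; the function name and data are hypothetical, and confidence intervals would come from a regression with clustered standard errors.

```python
from statistics import mean

def event_study(treated, control, baseline=-1):
    """Point estimates by event time k, relative to the omitted baseline period.

    treated / control: dicts mapping event time k -> list of outcomes.
    """
    gap = {k: mean(treated[k]) - mean(control[k]) for k in treated}
    return {k: gap[k] - gap[baseline] for k in sorted(gap) if k != baseline}

# Hypothetical outcomes by event time (invented numbers)
treated = {-3: [5.0, 7.0], -2: [6.0, 8.0], -1: [7.0, 9.0],
           0: [10.0, 12.0], 1: [11.0, 13.0]}
control = {-3: [3.0, 5.0], -2: [4.0, 6.0], -1: [5.0, 7.0],
           0: [6.0, 8.0], 1: [7.0, 9.0]}

coefs = event_study(treated, control)
for k, value in coefs.items():
    print(k, value)
```

Here the gap between groups is 2 in every pre-period and 4 from k = 0 onward, so the leads are zero and the lags are 2: the pattern a credible design should show.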

Pre-trends are non-negotiable

If the leading coefficients are non-zero and trending, your design fails. The fix is to find a different control group, restrict the sample, or switch to a different identification strategy (synthetic control, regression discontinuity).

Two-way fixed effects

Generalising beyond two periods, the estimating equation becomes:

yᵢₜ = αᵢ + λₜ + δ · Treatᵢₜ + uᵢₜ

Treatᵢₜ equals 1 when unit i is under treatment in period t and 0 otherwise. αᵢ is a unit fixed effect (absorbs everything time-invariant about i). λₜ is a time fixed effect (absorbs everything that affects all units in period t). δ is identified from within-unit changes net of common time shocks.
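For a balanced panel with a single treatment dummy, the two-way fixed-effects coefficient can be computed without any dummies: subtract unit means and period means from both the outcome and the treatment indicator, add back the grand mean, and regress one residual on the other. The sketch below assumes a balanced panel; the function name twfe and the simulated panel are illustrative.

```python
def twfe(y, d):
    """TWFE slope for a balanced panel; y and d are lists of per-unit lists."""
    n, t = len(y), len(y[0])

    def double_demean(m):
        unit_mean = [sum(row) / t for row in m]
        time_mean = [sum(m[i][s] for i in range(n)) / n for s in range(t)]
        grand = sum(unit_mean) / n
        return [[m[i][s] - unit_mean[i] - time_mean[s] + grand
                 for s in range(t)] for i in range(n)]

    y_dd, d_dd = double_demean(y), double_demean(d)
    num = sum(y_dd[i][s] * d_dd[i][s] for i in range(n) for s in range(t))
    den = sum(d_dd[i][s] ** 2 for i in range(n) for s in range(t))
    return num / den

# Hypothetical noiseless panel: 3 units, 4 periods, true effect of 2
unit_effects = [1.0, 2.0, 3.0]
time_effects = [0.0, 0.5, 1.0, 1.5]
d = [[0, 0, 1, 1],   # unit 0 is treated from period 2 onward
     [0, 0, 0, 0],   # units 1 and 2 are never treated
     [0, 0, 0, 0]]
true_delta = 2.0
y = [[unit_effects[i] + time_effects[s] + true_delta * d[i][s]
      for s in range(4)] for i in range(3)]

delta_hat = twfe(y, d)
print(round(delta_hat, 6))
```

Because the unit and period effects are removed by the demeaning, the estimate recovers the true effect of 2 even though the units start at different levels and share an upward trend.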

The staggered-treatment problem

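When treatment timing varies across units (different states adopting a policy in different years), the simple two-way fixed-effects regression decomposes into a weighted average of pairwise DiD comparisons, and some of those comparisons use already-treated units as controls. If treatment effects change over time or differ across cohorts, these comparisons put negative weights on some treated observations, so the estimate can be biased, even wrong-signed, when the true effect is positive everywhere.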
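A tiny constructed example makes the negative weights visible. In a balanced panel the two-way fixed-effects coefficient is a weighted sum of outcomes, with weights proportional to the double-demeaned treatment dummy. The sketch below uses two hypothetical units, three periods, untreated outcomes of zero, and an effect that grows with time since adoption; all names and numbers are assumptions for illustration.

```python
def twfe_weights(d):
    """Weight the TWFE coefficient places on each cell's outcome (balanced panel)."""
    n, t = len(d), len(d[0])
    unit_mean = [sum(row) / t for row in d]
    time_mean = [sum(d[i][s] for i in range(n)) / n for s in range(t)]
    grand = sum(unit_mean) / n
    resid = [[d[i][s] - unit_mean[i] - time_mean[s] + grand
              for s in range(t)] for i in range(n)]
    den = sum(v * v for row in resid for v in row)
    return [[v / den for v in row] for row in resid]

d = [[0, 1, 1],    # early adopter: treated from period 1
     [0, 0, 1]]    # late adopter: treated from period 2

# Hypothetical treatment effects, positive in every treated cell
effects = [[0.0, 1.0, 5.0],
           [0.0, 0.0, 1.0]]

w = twfe_weights(d)

# Untreated outcomes are zero, so the observed outcome equals the effect
delta = sum(w[i][s] * effects[i][s] for i in range(2) for s in range(3))
treated_effects = [effects[i][s] for i in range(2) for s in range(3) if d[i][s]]
true_att = sum(treated_effects) / len(treated_effects)

print(w[0][2])               # weight on the early adopter's second treated period
print(delta, true_att)
```

The early adopter's later treated period gets a weight of -0.5 because it serves as the control when the late adopter switches on. Every effect is positive and the average effect on treated cells is 7/3, yet the regression coefficient comes out at -1.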

Goodman-Bacon (2021), de Chaisemartin & D'Haultfœuille (2020), Callaway & Sant'Anna (2021), and Sun & Abraham (2021) each diagnosed this problem or proposed estimators that avoid it. The modern toolkit:

did_imputation (Borusyak, Jaravel, Spiess 2024): imputes counterfactuals for treated cells
csdid (Callaway-Sant'Anna): group-time average treatment effects
stackedev: stacked event-study regressions, one cohort at a time

One consequence: staggered DiD papers from before 2018 that relied on plain two-way fixed effects may need re-examination.

Standard errors in DiD

Bertrand, Duflo & Mullainathan (2004) showed that naive standard errors in DiD regressions are severely understated when outcomes are serially correlated within units, which is almost always the case. Cluster at the level at which treatment is assigned (state, county, firm), which is usually also the unit level. Cluster-robust inference needs roughly 30 or more clusters; with fewer, use the wild cluster bootstrap.
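The gap between naive and clustered standard errors can be seen in a minimal sketch for a regression of an outcome on a single treatment dummy. The cluster-robust formula below is the basic version with no finite-sample correction (packaged implementations usually add one), and the function name and data are hypothetical.

```python
from math import sqrt

def slope_with_ses(x, y, cluster):
    """OLS slope of y on x (with intercept), plus naive and cluster-robust SEs."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    x_dev = [xi - x_bar for xi in x]
    sxx = sum(v * v for v in x_dev)
    slope = sum(v * (yi - y_bar) for v, yi in zip(x_dev, y)) / sxx
    resid = [(yi - y_bar) - slope * v for v, yi in zip(x_dev, y)]

    # Naive SE: treats every observation as an independent draw
    naive_se = sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust SE: add up x * residual within each cluster before squaring
    score = {}
    for v, e, g in zip(x_dev, resid, cluster):
        score[g] = score.get(g, 0.0) + v * e
    cluster_se = sqrt(sum(s * s for s in score.values())) / sxx
    return slope, naive_se, cluster_se

# Hypothetical data: 4 clusters of 3 observations, treatment assigned by cluster,
# outcomes dominated by a shock shared within each cluster
x = [0] * 6 + [1] * 6
cluster = ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3
y = [1.0, 1.2, 0.8, -1.0, -0.8, -1.2, 3.0, 3.2, 2.8, 1.0, 1.2, 0.8]

slope, naive_se, cluster_se = slope_with_ses(x, y, cluster)
print(round(slope, 4), round(naive_se, 4), round(cluster_se, 4))
```

The naive formula behaves as if there were 12 independent observations; the clustered formula recognises that there are effectively 4, and its standard error is correspondingly larger. With only 4 clusters, this is exactly the situation where you would move to the wild cluster bootstrap.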

Concrete example: Card-Krueger 1994

New Jersey raised its minimum wage from $4.25 to $5.05 in April 1992. Eastern Pennsylvania didn't. Card and Krueger surveyed fast-food restaurants in both areas before and after. They found employment in NJ rose modestly relative to PA — directly contradicting the standard demand-curve prediction.

The paper kicked off a 30-year debate, but the methodology, a clean two-by-two DiD on a sharp policy change, became the template for empirical labour economics. The headline result has been both challenged and defended in later replications and extensions, so treat it as contested rather than settled.

Exercise

You're evaluating a SACCO loan-rate cap that took effect in 2022 in 5 of 12 counties. You want to estimate its effect on borrowing. Sketch (a) the regression specification, (b) the parallel-trends test, (c) the SE clustering you'd use.