Every policy question — did the cash transfer reduce poverty, did the training raise wages, did the textbook improve learning — is a question about cause and effect. And causal questions are far harder than they look, because of one inescapable problem this module sets out. Getting this problem clear is the foundation of the entire course: everything that follows is a strategy for solving it.
The counterfactual
The fundamental problem of causal inference
To know the effect of a policy on a person, you would need to compare two things: their outcome WITH the policy and their outcome WITHOUT it (the counterfactual). The causal effect is the difference between the two. The potential-outcomes framework (Rubin) makes this precise. For each unit i there are two potential outcomes: Y_i(1), the outcome if treated, and Y_i(0), the outcome if not treated. The treatment effect for unit i is Y_i(1) − Y_i(0).
The fundamental problem (Holland, 1986) is that for any given unit we can only ever observe ONE of these: the unit either received the treatment or did not, never both. For a treated unit Y_i(0) is missing; for an untreated unit Y_i(1) is missing. The counterfactual is, by definition, unobservable. The individual treatment effect can therefore never be computed directly; it must be INFERRED by finding a credible stand-in for the missing counterfactual. The whole of impact evaluation is the search for that stand-in: a comparison group that tells us what would have happened to the treated in the absence of treatment.
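The missing-data nature of the problem can be made concrete with a small sketch. The numbers below are invented purely for illustration, and the names `Unit` and `as_seen_by_analyst` are hypothetical helpers, not part of any library. In a simulation we can write down both potential outcomes; the function then hides whichever one real data would never reveal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Unit:
    y0: float      # potential outcome without treatment, Y_i(0)
    y1: float      # potential outcome with treatment, Y_i(1)
    treated: bool  # whether the unit actually received the treatment

    @property
    def true_effect(self) -> float:
        # Individual treatment effect Y_i(1) - Y_i(0); knowable only in a simulation.
        return self.y1 - self.y0


def as_seen_by_analyst(unit: Unit) -> Tuple[Optional[float], Optional[float]]:
    """Return (Y(0), Y(1)) with the unobserved potential outcome set to None."""
    if unit.treated:
        return (None, unit.y1)
    return (unit.y0, None)


# Illustrative, made-up values.
units = [
    Unit(y0=50.0, y1=60.0, treated=True),
    Unit(y0=40.0, y1=45.0, treated=False),
    Unit(y0=70.0, y1=85.0, treated=True),
    Unit(y0=30.0, y1=30.0, treated=False),
]

for i, unit in enumerate(units, start=1):
    seen_y0, seen_y1 = as_seen_by_analyst(unit)
    print(f"unit {i}: Y(0)={seen_y0}  Y(1)={seen_y1}  treated={unit.treated}")
```

Every row printed has exactly one `None`: the analyst's table is half empty by construction, which is why `true_effect` cannot be computed from observed data for any single unit.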
Why the naive comparisons fail
- Before-after (pre-post): comparing the treated group's outcome AFTER the policy to its outcome BEFORE it. The problem: other things change over time (the economy, the weather, secular trends), so the before-after difference confounds the policy's effect with everything else that happened. A training programme's graduates may earn more than they did before, but so may everyone else, because the economy grew.
- Treated vs untreated (cross-sectional): comparing those who got the treatment to those who didn't. The problem: the two groups DIFFER in ways that affect outcomes, and those differences, not the treatment, may explain the gap. This is selection bias.
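A toy calculation shows how far each naive comparison can drift from the truth. All figures are assumptions chosen for illustration: a true training effect of 5, economy-wide earnings growth of 10, and participants who already earned 20 more than non-participants before the programme. The function `earnings` is a hypothetical data-generating rule, not an estimator.

```python
# Assumed, illustrative parameters (not from any real programme).
BASE_EARNINGS = 100.0  # non-participants' earnings before the programme
BASELINE_GAP = 20.0    # participants' pre-existing advantage
TIME_TREND = 10.0      # growth that affects everyone between the two periods
TRUE_EFFECT = 5.0      # what the training actually adds


def earnings(is_participant: bool, after: bool) -> float:
    """Average earnings of a group in a period under the assumed rule."""
    level = BASE_EARNINGS
    if is_participant:
        level += BASELINE_GAP
    if after:
        level += TIME_TREND
        if is_participant:
            level += TRUE_EFFECT
    return level


# Before-after: participants after vs participants before.
before_after = earnings(True, True) - earnings(True, False)

# Cross-sectional: participants after vs non-participants after.
cross_section = earnings(True, True) - earnings(False, True)

print(f"true effect:              {TRUE_EFFECT}")
print(f"before-after estimate:    {before_after}")   # effect + time trend
print(f"cross-sectional estimate: {cross_section}")  # effect + baseline gap
```

Under these assumptions the before-after comparison returns 15 (the effect plus the time trend) and the cross-sectional comparison returns 25 (the effect plus the baseline gap). Neither recovers the true effect of 5.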
Selection bias
The central enemy
Selection bias arises because the units that receive a treatment are systematically DIFFERENT from those that don't, in ways that affect the outcome. A simple comparison of treated and untreated therefore reflects both the treatment effect AND the pre-existing differences (the selection), and the two cannot be separated from that comparison alone. People who ENROL in a training programme are often more motivated, more able, or better connected than those who don't, so they would likely have earned more anyway, even without the training. People who USE a clinic may be sicker (or healthier) than those who don't. Firms that ADOPT a technology are often better managed. In every case the treated and untreated differ at baseline, so the raw difference in outcomes is a biased estimate of the effect, and often it is not even clear in which direction. Selection bias is the central enemy of causal inference, and every method in this course is, at bottom, a strategy for eliminating it by constructing a control group that is comparable to the treated group in everything except the treatment.
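The argument above can be written as a standard decomposition of the raw treated-versus-untreated gap. The notation D_i (1 if unit i is treated, 0 otherwise) and Y_i (the observed outcome) is introduced here for convenience and is not used elsewhere in this module.

```latex
\begin{aligned}
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  &= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \\
  &= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{average effect on the treated}}
   + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\end{aligned}
```

The second line adds and subtracts E[Y_i(0) | D_i = 1], the treated group's unobserved counterfactual mean. The selection-bias term is zero only if the treated and untreated groups would have had the same average outcome without treatment.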
What we can estimate
Since individual effects are unobservable, impact evaluation targets AVERAGE effects across a population. The average treatment effect (ATE) is the average of Y(1) − Y(0) across everyone; the average treatment effect on the treated (ATT) is the average of Y(1) − Y(0) among those who actually got the treatment (often the policy-relevant quantity). The two differ whenever those who take up the treatment benefit more (or less) than those who do not would have. These averages CAN be estimated, IF we can find a credible counterfactual for the treated group, i.e., a control group whose average observed outcome validly stands in for what the treated group's average outcome would have been without treatment. The art of impact evaluation is constructing that credible control group, which is exactly what randomisation (next module) does so powerfully and what the quasi-experimental methods (later modules) attempt when randomisation isn't possible.
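The difference between ATE, ATT, and the naive comparison can be checked on a tiny simulated population. The six units below are invented for illustration, with both potential outcomes written down (something only a simulation allows). They are constructed so that units with higher Y(0) and larger gains are the ones that take up the treatment.

```python
from statistics import mean

# (Y(0), Y(1), treated) for six hypothetical units; values are made up.
population = [
    (10.0, 11.0, False),
    (12.0, 14.0, False),
    (14.0, 17.0, False),
    (20.0, 26.0, True),
    (22.0, 30.0, True),
    (24.0, 34.0, True),
]

# ATE: average of Y(1) - Y(0) over everyone.
ate = mean([y1 - y0 for y0, y1, _ in population])

# ATT: average of Y(1) - Y(0) over the treated only.
att = mean([y1 - y0 for y0, y1, treated in population if treated])

# Naive comparison: what the analyst can actually compute from observed data.
observed_treated = mean([y1 for _, y1, treated in population if treated])
observed_untreated = mean([y0 for y0, _, treated in population if not treated])
naive = observed_treated - observed_untreated

# Gap in average Y(0) between the groups: how badly the untreated
# stand in for the treated group's missing counterfactual.
counterfactual_treated = mean([y0 for y0, _, treated in population if treated])
y0_gap = counterfactual_treated - observed_untreated

print(f"ATE = {ate}, ATT = {att}, naive = {naive}, Y(0) gap = {y0_gap}")
```

With these numbers the ATE is 5, the ATT is 8, and the naive difference is 18. The naive figure exceeds the ATT by exactly 10, the gap in average Y(0) between the two groups, so the untreated here are not a credible counterfactual for the treated.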
Exercise
A microfinance organisation reports that clients who took its loans saw their businesses grow 20% while non-clients grew only 5%, and claims this 15-point gap as its impact. (1) Explain why this comparison does not establish the loans' causal effect. (2) Identify the specific selection bias at work and its likely direction. (3) State what the ideal (unobservable) counterfactual would be. (4) Explain why a before-after comparison of clients alone would also fail.