Module 08 of 855 min readIntermediate

Field experiments in behavioural economics

RCTs, stepped-wedge, the J-PAL evidence base, methodological limits, designing a behavioural-finance evaluation.

Learning objectives

By the end of this module, you should be able to:

  • 01 Recognise the principal field-experimental designs in development economics
  • 02 Distinguish efficacy from effectiveness studies
  • 03 Apply the J-PAL evidence base to evaluate a behavioural-finance intervention
  • 04 Identify the methodological limits and open questions

Field experiments — randomised controlled trials run in real-world settings — have produced most of the rigorous empirical evidence in behavioural-development economics over the past 20 years. This module covers the methodology, the canonical findings, and the open questions that the field is still working through.

Field-experimental designs

  • Individual-level RCT: random assignment at the individual level. Best for individually administered interventions (commitment products, savings reminders)
  • Cluster-randomised: assignment at group level (villages, schools, classes). Used when spillovers or coordination within the cluster matter
  • Stepped-wedge: clusters start treatment at different, randomly ordered times, so everyone eventually receives treatment. Useful when ethical or political concerns rule out a permanent treatment-control split
  • Encouragement design: randomises encouragement to take up a product or programme rather than the treatment itself. Used when access cannot be randomised but encouragement can. Comparing the encouraged and non-encouraged groups gives the intent-to-treat effect
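
The stepped-wedge idea in the list above can be made concrete with a short sketch. It assumes a hypothetical set of ten clusters rolled out in five waves; the function names, cluster labels, and seed are illustrative, not taken from any particular study.

```python
import random

def stepped_wedge_schedule(clusters, n_waves, seed=0):
    """Randomly assign each cluster to a rollout wave (1..n_waves).

    Period 0 is treated as an all-control baseline; a cluster assigned
    to wave w starts treatment in period w.
    """
    rng = random.Random(seed)   # fixed seed so the draw can be reproduced and audited
    order = list(clusters)
    rng.shuffle(order)          # the random rollout order is what identifies the effect
    # Deal clusters into waves round-robin so wave sizes differ by at most one.
    return {cluster: (i % n_waves) + 1 for i, cluster in enumerate(order)}

def is_treated(schedule, cluster, period):
    """A cluster is treated from its start wave onwards."""
    return period >= schedule[cluster]

# Hypothetical example: 10 clusters, 5 waves.
clusters = [f"cluster_{k:02d}" for k in range(1, 11)]
schedule = stepped_wedge_schedule(clusters, n_waves=5, seed=42)

for period in range(0, 6):
    n_treated = sum(is_treated(schedule, c, period) for c in clusters)
    print(f"period {period}: {n_treated} of {len(clusters)} clusters treated")
```

In this sketch no cluster is treated at baseline, two more clusters start in each period, and all ten are treated by period 5. At every intermediate period the not-yet-treated clusters serve as the comparison group.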

Efficacy vs effectiveness

  • Efficacy studies — does the intervention work under ideal conditions? Implemented by experienced researchers, in selected high-quality contexts. Internal validity is high; external validity is uncertain
  • Effectiveness studies — does the intervention work when implemented at scale by typical operators? Implemented by government agencies, NGOs at scale. External validity is the target

Most behavioural-finance field experiments are efficacy studies run by researchers in carefully selected contexts. Whether the same intervention is effective at scale under government implementation is a separate question, and effect sizes are often smaller.

Canonical behavioural-finance field experiments

Karlan-McConnell-Mullainathan-Zinman (2016) — Savings reminders

Three countries (Bolivia, Peru, and the Philippines). Bank clients with savings accounts were randomly assigned to receive periodic reminders about their savings goals. Result: a 6-16% increase in savings among those who received reminders. The effect per person is modest, but the aggregate effect across millions of customers is large.

Beaman-Karlan-Thuysbaert-Udry (2014) — Cash transfers + financial literacy

Mali. Random assignment of agricultural inputs + financial-literacy training. Treatment-group households increased income 30%+. Decomposition: most of the gain was from inputs, not from financial-literacy training per se.

Dupas-Robinson (2013) — Mental accounting and commitment savings

Western Kenya. RCT of simple lockbox savings devices. Treatment increased savings rates by 66% and led to more business investment. Mechanism: the device created a mental-accounting compartment and a commitment device simultaneously.

Brune-Giné-Goldberg-Yang (2016) — Commitment savings

Malawi. RCT of commitment savings accounts tied to harvest income. Treatment increased savings by 60% and led to 22% higher subsequent harvest because farmers could afford more inputs.

Karlan-Mullainathan-Roth (2019) — Meta-analysis

Meta-analysis of 22 commitment-savings RCTs across 9 countries (mostly developing). Common findings:

  • Take-up rates 20-40% across studies
  • Among takers, savings increases 50-90%
  • Sustained effects: 1-3 year persistence after intervention
  • Heterogeneous effects: men benefit less than women in some studies; effects stronger for sophisticated time-inconsistent agents
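
The take-up and effect-on-takers figures above can be combined into a rough average effect across everyone offered a product (the intent-to-treat effect). The sketch below is back-of-envelope arithmetic only. It assumes that people who decline the product are unaffected by the offer and that takers and non-takers have similar baseline savings; the function name is illustrative.

```python
def implied_itt(take_up, effect_on_takers):
    """Average effect across everyone offered the product,
    assuming non-takers are unaffected by the offer."""
    return take_up * effect_on_takers

# Ranges taken from the bullets above (take-up 20-40%, effect on takers 50-90%).
low = implied_itt(0.20, 0.50)    # lowest take-up, smallest effect on takers
high = implied_itt(0.40, 0.90)   # highest take-up, largest effect on takers

print(f"Implied ITT range: {low:.0%} to {high:.0%}")
```

Under these assumptions the implied increase across all those offered the product is roughly 10% to 36%, well below the headline effect among takers. This is why take-up matters as much as the effect on users when judging a product at scale.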

Open methodological questions

  • External validity — does an effect demonstrated in Kenya replicate in Tanzania? In urban vs rural? In a different macroeconomic environment? The literature is converging on 'often yes, sometimes no' — context matters, but mechanisms can be transported
  • Implementation fidelity at scale — government-run programmes don't always deliver the same effect as researcher-supervised pilots. Microcredit scaled imperfectly; deworming scaled with some implementation degradation
  • Heterogeneous effects — average treatment effect can mask substantial heterogeneity. Knowing the average doesn't tell you which subset of users benefit
  • Long-run effects — most studies have 1-3 year follow-up. Longer-run effects (5-10 years) are less studied but often substantial. Miguel-Kremer deworming showed 10-15 year wage effects
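
The point about heterogeneous effects can be illustrated with a toy calculation. The records below are invented purely for illustration: two hypothetical subgroups, A and B, with made-up monthly savings figures.

```python
from statistics import mean

# Hypothetical records: (subgroup, treated?, monthly savings). Illustrative only.
records = [
    ("A", 1, 300), ("A", 1, 340), ("A", 0, 200), ("A", 0, 220),
    ("B", 1, 150), ("B", 1, 170), ("B", 0, 160), ("B", 0, 160),
]

def diff_in_means(rows):
    """Treated mean minus control mean for the given rows."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)

overall = diff_in_means(records)
by_group = {
    g: diff_in_means([r for r in records if r[0] == g])
    for g in ("A", "B")
}

print("average treatment effect:", overall)
print("effect by subgroup:", by_group)
```

In this made-up data the average effect is 55, but it comes entirely from subgroup A (110); subgroup B sees no effect. Real subgroup analyses would also need pre-specified subgroups and standard errors, which this sketch omits.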

Applications to African behavioural-finance design

Practical implications for product design and policy:

  • Commitment savings products work — at scale, well-designed commitment savings products produce 30-60% savings increases in target populations
  • Default-design captures most users — auto-enrolment with opt-out captures 70-90% of eligible users
  • Savings reminders are cheap and modestly effective — should be part of any savings-product design
  • Insurance subsidies + bundled enrolment work better than voluntary insurance markets at low income levels
  • Goal-anchored savings products outperform unlabelled savings — make the labelling explicit and visible

What the field experiments don't answer

Important limitations:

  • Macroeconomic effects: RCTs measure individual-level impact. Aggregate questions, such as whether a universal HSNP (Hunger Safety Net Programme) would raise inflation or whether a universal SHIF (Social Health Insurance Fund) would improve population health, require structural macroeconomic modelling, not RCTs
  • Political economy: what level of public spending is right? An RCT can show that this programme works at this scale; it cannot rank the programme against competing uses of the budget
  • Institutional change: institutional reforms can't be randomly assigned. Some of the most important development outcomes (rule of law, regulatory quality) are studied with non-experimental methods
  • Cultural and contextual variation: RCT results are context-specific. Whether findings generalise across cultural contexts is an empirical question, not a theoretical certainty

Exercise

You're a development-finance researcher commissioned by Kenya's SACCO Societies Regulatory Authority (SASRA) to design an evaluation of a new commitment-savings product to be rolled out across 50 SACCOs (~500,000 members). Your goal: a rigorous evaluation of whether the product increases members' savings and welfare. SASRA wants results in 18 months. Available budget: KES 50 million. Constraints: members can't be denied access to the product (for ethical and political reasons), and SASRA wants to know whether the product should be expanded to all 250+ Kenyan SACCOs. Design the evaluation.
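
One ingredient of any answer is a power check: with only 50 SACCOs, how small an effect could the evaluation detect? The sketch below uses the standard design-effect approximation for a cluster-randomised comparison of two equal arms. All inputs are assumptions chosen for illustration: 25 SACCOs receive the product early and 25 later, 40 members are surveyed per SACCO, and the intra-cluster correlation (ICC) is 0.05. A full stepped-wedge analysis would need its own variance formula.

```python
from math import sqrt
from statistics import NormalDist

def design_effect(members_per_cluster, icc):
    """Variance inflation from sampling members within clusters: 1 + (m - 1) * ICC."""
    return 1 + (members_per_cluster - 1) * icc

def mde_in_sd(clusters_per_arm, members_per_cluster, icc, alpha=0.05, power=0.80):
    """Minimum detectable effect in standard-deviation units for a
    two-arm, equal-sized, cluster-randomised comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n_per_arm = clusters_per_arm * members_per_cluster
    deff = design_effect(members_per_cluster, icc)
    standard_error = sqrt(deff * 2 / n_per_arm)     # SE of the difference, outcome SD = 1
    return (z_alpha + z_power) * standard_error

# Hypothetical inputs: 25 early vs 25 late SACCOs, 40 surveyed members each, ICC = 0.05.
deff = design_effect(40, 0.05)
mde = mde_in_sd(clusters_per_arm=25, members_per_cluster=40, icc=0.05)

print(f"design effect = {deff:.2f}, MDE = {mde:.2f} SD")
```

Under these assumed inputs the design effect is close to 3 and the minimum detectable effect is a little over 0.2 standard deviations. Surveying more members per SACCO helps only slowly, because the number of clusters, not the number of members, is the binding constraint.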

Key takeaways

  • Field-experimental designs (RCTs, stepped-wedge, encouragement designs) produce most of the rigorous evidence in behavioural-development economics
  • Canonical findings: commitment savings, default enrolment, savings reminders, and goal-anchored products all produce robust positive effects
  • Efficacy at small scale and effectiveness at large scale aren't the same; implementation fidelity matters
  • Limitations: macroeconomic effects, political-economy trade-offs, institutional change, and cultural variation aren't captured well by individual-level RCTs

Further reading

  1. Field Experiments in Development Economics
     Esther Duflo · Brookings Papers on Economic Activity · 2006
     Duflo's accessible methodology paper and the starting point for understanding the approach. Duflo went on to share the 2019 Nobel Prize in economics with Banerjee and Kremer for this experimental approach.

  2. Save More Tomorrow
     Richard Thaler and Shlomo Benartzi · Journal of Political Economy 112(S1) · 2004
     The original 'auto-escalation' pension product. Influential in US 401(k) reform; the principles apply to African pension and SACCO design.

LeadAfrik Public Economics Hub