Data quality, ethics, and reproducibility — Measurement Module 8

The course — and the whole program — ends where it should: with the standards that make data trustworthy. Good measurement is not just good technique; it is honest, ethical, and reproducible practice. This final module covers measurement error, research ethics, and the reproducibility crisis and its remedies — and closes by connecting the unglamorous foundation of good data to everything built on it across the program.

Measurement error

Classical vs non-classical error

Measurement error (the gap between the recorded value and the true value) is pervasive: recall error, mis-reporting, enumerator error, proxy imperfection. Its effect depends on its TYPE:

• Classical measurement error: random noise UNCORRELATED with the true value (and with everything else). In an OUTCOME variable, it adds noise but doesn't bias the estimate (it just reduces precision). In an EXPLANATORY variable (a regressor), it causes ATTENUATION bias: in a simple regression it biases the estimated effect TOWARD ZERO (the classic 'errors-in-variables' result), so you understate relationships.

• Non-classical measurement error: error CORRELATED with the true value or with other variables (e.g., the rich systematically under-report income MORE than the poor; people over-report socially-desirable behaviour). This is more dangerous: it can bias estimates in ANY direction, and it cannot be assumed to wash out.

Most real measurement error in surveys is non-classical (systematic under-reporting of income, assets, stigmatised behaviour), which is why understanding HOW the data were measured (the whole course) is essential to knowing which way the errors bias your conclusions. The lesson: measurement error is not just 'noise' to be ignored. Its structure (classical vs non-classical, in the outcome vs a regressor) determines whether and how it biases your results, and good analysis takes it seriously (validation, multiple measures, the total-survey-error framework that accounts for all error sources).
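The difference between these cases can be seen in a small simulation. The sketch below is illustrative only: the true slope of 2, the unit variances, the sample size, and the 50% proportional under-reporting rule are all assumptions chosen for the example, not figures from the course. It regresses an outcome on a regressor four ways: with clean data, with classical noise in the outcome, with classical noise in the regressor, and with one stylised non-classical case in which reported values are a fixed fraction of the truth (so those with larger true values under-report more in absolute terms).

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=20000, beta=2.0, sd_true=1.0, sd_error=1.0, seed=0):
    """Compare slopes under different (assumed) error structures."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0, sd_true) for _ in range(n)]
    y_true = [beta * x + rng.gauss(0, 1.0) for x in x_true]

    # Classical error: independent noise added to the recorded value.
    x_noisy = [x + rng.gauss(0, sd_error) for x in x_true]
    y_noisy = [y + rng.gauss(0, sd_error) for y in y_true]

    # Stylised non-classical error (assumption): everyone reports half
    # of the true value, so the error grows with the true value.
    x_under = [0.5 * x for x in x_true]

    return {
        "clean": ols_slope(x_true, y_true),
        "noisy_outcome": ols_slope(x_true, y_noisy),
        "noisy_regressor": ols_slope(x_noisy, y_true),
        "proportional_under_reporting": ols_slope(x_under, y_true),
        # Textbook attenuation factor: var(true) / (var(true) + var(error)).
        "predicted_attenuated": beta * sd_true**2 / (sd_true**2 + sd_error**2),
    }


results = simulate()
for name, value in results.items():
    print(f"{name:30s} {value:.3f}")
```

Under these assumptions, noise in the outcome should leave the slope close to the true value of 2, noise in the regressor should pull it toward the predicted attenuated value of 1, and proportional under-reporting should push it AWAY from zero (toward 4). The last case is the point of the classical/non-classical distinction: the direction of the bias depends on how the error is structured.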

Research ethics

Collecting data on people carries ethical obligations. The core principles: INFORMED CONSENT (subjects understand the research and agree to participate, freely); CONFIDENTIALITY and anonymisation (protecting subjects' identities and sensitive data — acute for the personal data of module 7); DO NO HARM (the research must not harm subjects — physically, psychologically, socially, or by exposing them to risk); and respect for persons (especially VULNERABLE subjects — the poor, children, the marginalised, who may have less power to refuse). These are enforced through ETHICS REVIEW (Institutional Review Boards / ethics committees that must approve research before it proceeds). Development research raises specific issues: power asymmetries between (often foreign, well-resourced) researchers and (poor, less-powerful) subjects; the ethics of RCTs (withholding treatment — the equipoise and scarcity arguments of the Impact Evaluation course); consent in low-literacy or coercive contexts; and the use of sensitive personal data (module 7's phone/financial records). Ethical data practice is not optional bureaucracy — it is a fundamental obligation, and the history of research abuses is why ethics review exists. The do-no-harm principle and genuine informed consent are non-negotiable.

The reproducibility crisis

The credibility crisis and its remedies

Across the social and biomedical sciences, a REPRODUCIBILITY/REPLICATION CRISIS emerged: many published findings FAIL TO REPLICATE when re-tested, revealing that a substantial share of the published literature may be false positives. The causes connect to the Impact Evaluation course: P-HACKING and specification search (trying many analyses and reporting the ones that 'work' — with enough tests, some are significant by chance); PUBLICATION BIAS (journals favour significant, surprising results, so the literature over-represents flukes); selective reporting; and outright error or fraud. The crisis is serious because policy built on findings that don't replicate is built on sand. The responses (the 'credibility' or 'open science' movement): PRE-REGISTRATION / pre-analysis plans (specifying the analysis before seeing the data — the Impact Evaluation course — so results can't be cherry-picked); OPEN DATA and CODE (publishing the data and analysis code so others can REPRODUCE the results and check them); transparent DOCUMENTATION and metadata (so others can understand and reuse the data); registered reports (peer review BEFORE results are known); and reproducibility checks (journals requiring code that runs). The movement (Christensen-Miguel and others) has substantially changed norms in economics — pre-registration, open data, and replication are increasingly required. The principle: transparent, reproducible, pre-registered research is more credible than results you have to take on faith, and the reforms make science self-correcting. There is a tension with PRIVACY (open data vs protecting sensitive subjects — module 7), managed through anonymisation and controlled access.
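The arithmetic behind 'with enough tests, some are significant by chance' can be made concrete. The sketch below is a stylised illustration, not a model of any particular study: it assumes an intervention with NO true effect, 20 independent outcomes, 50 subjects per arm, a 5% significance threshold, and a z-test with known variance. All of these are assumptions chosen for the example. It compares a researcher who reports one pre-specified outcome with one who reports whichever of the 20 outcomes 'works'.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05  # conventional significance threshold (assumed)


def chance_of_false_positive(n_tests, alpha=ALPHA):
    """P(at least one of n independent tests is significant) when every null is true."""
    return 1 - (1 - alpha) ** n_tests


def p_value_no_effect(rng, n_per_arm):
    """Two-sided z-test p-value comparing two arms of pure noise (known sd = 1)."""
    treated = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    z = (fmean(treated) - fmean(control)) / (2 / n_per_arm) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def share_of_studies_with_a_finding(n_outcomes, n_studies=1000, n_per_arm=50, seed=1):
    """Share of simulated null studies whose best outcome clears the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [p_value_no_effect(rng, n_per_arm) for _ in range(n_outcomes)]
        if min(p_values) < ALPHA:  # report only the outcome that 'works'
            hits += 1
    return hits / n_studies


analytic = chance_of_false_positive(20)
searched = share_of_studies_with_a_finding(20)
pre_specified = share_of_studies_with_a_finding(1)

print(f"analytic chance, 20 tests:       {analytic:.2f}")
print(f"simulated, best of 20 outcomes:  {searched:.2f}")
print(f"simulated, 1 pre-specified test: {pre_specified:.2f}")
```

With 20 independent tests the analytic chance of at least one false positive is about 0.64, against 0.05 for a single pre-specified test. That gap is what a pre-analysis plan is meant to close: it commits the researcher to the single test before the data are seen. Real outcomes are usually correlated, which would lower the figure for the searched case, so the number should be read as an illustration of the mechanism rather than an estimate for any actual literature.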

The foundation everything rests on

This closes the course and the program. Every policy number (the GDP in a debt ratio, the poverty rate in a targeting decision, the effect size in an evaluation, the index in a ranking, the inequality figure in a debate) rests on data that were measured, sampled, aggregated, and analysed through the choices this course has covered. If the measurement is bad, the handling dishonest, or the analysis irreproducible, the policy built on it is unsound, however sophisticated the method. So good measurement and honest, ethical, reproducible data practice are the UNGLAMOROUS FOUNDATION on which the entire edifice of public economics stands, connecting back to module 1's premise (methods can't rescue bad data) and forward to every substantive question in the program. The Tax Policy course's revenue numbers, the Public Budgeting course's spending data, the Sovereign Debt course's debt statistics, the Development course's poverty and growth measures, the Impact Evaluation course's effect estimates, the Governance course's corruption and capacity measures: all depend on the data foundation this course is about. The final lesson of the Public Economics Program is therefore a humble and essential one: take measurement seriously, be honest about what the data can and cannot tell you, practise ethically and reproducibly, and remember that behind every confident policy number is a chain of measurement choices that the responsible analyst makes visible. Good policy rests on good data, honestly handled. That is where the whole program comes to ground.

Exercise

A widely-cited study claims a development intervention has a large positive effect, based on survey data; it is now influencing major policy. A replication attempt fails to reproduce the result. (1) Identify the likely measurement and methodological reasons the original might be wrong. (2) Explain how pre-registration and open data/code would have helped. (3) Identify the ethical obligations the original data collection had. (4) Explain why this episode illustrates the program's final lesson about data as the foundation.