Reporting and reproducibility — Stata Module 11 | LeadAfrik Public Economics Hub

Reproducibility in Stata is built around the do-file (your script), the log file (the run record), and the export tools that turn estimates into publication tables.

Logging a run

stata

capture log close
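* build a YYYY-MM-DD date stamp for the log file name
local today : display %tdCCYY-NN-DD date(c(current_date), "DMY")
log using "analysis_`today'.log", replace text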

* your analysis here

log close

Locals and globals — parameterising do-files

stata

local controls deposit_rate i.year
regress lending_rate `controls'

global CONTROLS deposit_rate i.year
regress lending_rate $CONTROLS
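
The two kinds of macro differ in scope as well as syntax. A minimal sketch, reusing the names above and meant to be run as one block from a do-file: a local exists only while the do-file (or program) that defined it is running, whereas a global persists for the rest of the Stata session unless it is dropped.

stata

local controls deposit_rate i.year
display "`controls'"        // prints: deposit_rate i.year

global CONTROLS deposit_rate i.year
display "$CONTROLS"         // prints: deposit_rate i.year

* globals outlive the do-file, so drop them when finished to avoid stale values
macro drop CONTROLS

Because a local disappears once the run ends, select and run the line that defines it together with the lines that use it; running the regress line on its own leaves `controls' empty.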

Loops — foreach and forvalues

stata

foreach var of varlist lending_rate deposit_rate spread {
    summarize `var'
    histogram `var', name(g_`var', replace)
}

forvalues y = 2020/2024 {
    summarize lending_rate if year == `y'
}
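
Loops and locals combine naturally. The sketch below assumes the same variables as above and uses hypothetical stored-estimate names (ols_lending_rate, ols_spread); it runs one regression per outcome and stores each result for later tabulation.

stata

local outcomes lending_rate spread
local controls deposit_rate i.year

foreach y of local outcomes {
    regress `y' `controls'
    estimates store ols_`y'
}

* list the stored results
estimates dir

Note that foreach ... of local takes the macro name without quotes, while the loop body refers to the current element as `y'.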

esttab and outreg2 — publication tables

stata

* install estout (which provides esttab) only if it is not already present
capture which esttab
if _rc ssc install estout

regress lending_rate deposit_rate
estimates store m1
regress lending_rate deposit_rate i.year
estimates store m2
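* xtreg needs the panel declared first
* (assumes bank_id identifies banks and year is the time variable)
xtset bank_id year
xtreg lending_rate deposit_rate, fe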
estimates store m3

esttab m1 m2 m3 using regressions.tex, ///
    cells(b(star fmt(3)) se(par fmt(3))) ///
    stats(N r2 r2_a, fmt(0 3 3) labels("Observations" "R-squared" "Adj R-squared")) ///
    star(* 0.10 ** 0.05 *** 0.01) ///
    label replace
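
outreg2 takes a different approach from esttab: instead of storing estimates and tabulating them in one call, it is run straight after each estimation command and adds one column at a time. A rough sketch, assuming the same variables as above and a hypothetical output file named regressions_outreg.doc:

stata

* install outreg2 only if it is not already present
capture which outreg2
if _rc ssc install outreg2

regress lending_rate deposit_rate
* first call creates the file
outreg2 using regressions_outreg.doc, replace ctitle(OLS) dec(3) label

regress lending_rate deposit_rate i.year
* later calls add a column to the same file
outreg2 using regressions_outreg.doc, append ctitle(Year FE) dec(3) label

The first call uses replace and every later call uses append; swapping the two silently overwrites the earlier columns.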

putexcel — custom Excel output

stata

putexcel set output.xlsx, replace
putexcel A1 = "Bank" B1 = "Mean rate"

levelsof bank_id, local(banks)
local row = 2
foreach b of local banks {
    summarize lending_rate if bank_id == `b', meanonly
    putexcel A`row' = `b' B`row' = `r(mean)'
    local row = `row' + 1
}

Reproducibility checklist

(1) Keep every step in a do-file. (2) Log every run. (3) Define paths and parameters as locals or globals at the top. (4) Use esttab for tables, graph export for figures, and putexcel for custom outputs. (5) Save intermediate datasets at each major step. With these five in place, rerunning a year-old analysis comes down to running the do-file again rather than reconstructing what was done.
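
The five points can be sketched as a single do-file skeleton. This is an illustration under stated assumptions, not a prescribed layout: the folder names, the file names rates_raw.dta and rates_clean.dta, and the version number are all hypothetical, and estout is assumed to be installed already.

stata

version 17                  // assumption: replace with the release you ran the analysis in
clear all
set more off

* (3) paths and parameters at the top
* assumption: the working directory is the project folder, with data/ and output/ subfolders
global ROOT "`c(pwd)'"
global DATA "$ROOT/data"
global OUT  "$ROOT/output"
local controls deposit_rate i.year

* (2) log the run
capture log close
log using "$OUT/run.log", replace text

* (1) every step lives in this do-file, starting from the raw data
use "$DATA/rates_raw.dta", clear
drop if missing(lending_rate, deposit_rate)

* (5) save the intermediate dataset
save "$DATA/rates_clean.dta", replace

* (4) tables and figures are written by code, not copied by hand
regress lending_rate `controls'
esttab using "$OUT/regressions.tex", se label replace
histogram lending_rate
graph export "$OUT/lending_rate.png", replace

log close

Keeping the raw file untouched and writing the cleaned data under a new name means any later step can be rerun from a known starting point.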

Exercise

Define a local named controls containing deposit_rate i.year, then use it in a regress command.