Stata's factor variable notation — i. for categorical, c. for continuous, ## for full interactions, # for interaction-only — makes complex specifications concise. The margins command then translates the resulting coefficient table into human-readable marginal effects.
i. and c. prefixes
regress y i.year      // year as categorical (one indicator per level, base level omitted)
regress y c.x         // explicitly continuous (the default for a variable outside an interaction)
regress y i.year c.x  // both in one model
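To see what the i. prefix does behind the scenes, the sketch below compares it with hand-built indicator variables. It assumes Stata's bundled auto dataset as a stand-in for your own data; price, rep78, and weight are variables from that dataset, and the rep_ stub and scalar names are chosen only for illustration.

```stata
* Assumption: the bundled auto dataset stands in for the reader's data.
sysuse auto, clear

* Factor-variable version: rep78 is expanded into indicators on the fly.
regress price i.rep78 c.weight
scalar b_fv = _b[3.rep78]

* Hand-built equivalent: create rep_1 ... rep_5, then omit rep_1 as the base.
tabulate rep78, generate(rep_)
regress price rep_2 rep_3 rep_4 rep_5 weight
scalar b_manual = _b[rep_3]

* The coefficient for level 3 should match across the two approaches.
display "factor notation: " scalar(b_fv) "   manual: " scalar(b_manual)
```

The factor-variable form leaves no extra variables in the dataset, and margins can only recognise rep78 as categorical when it is entered with i.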
Interactions: ## and #
regress y i.tier##c.assets  // main effects + interaction
regress y i.tier#c.assets   // interaction only
regress y c.x##c.x          // x and x-squared (using the same variable twice)
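The following sketch checks that ## is shorthand for the main effects plus the # term, and that c.x##c.x reproduces a hand-built squared term. It assumes the bundled auto dataset, with foreign playing the role of tier and weight the role of assets; weight2 and the scalar names are illustrative only.

```stata
* Assumption: the bundled auto dataset stands in for the reader's data.
sysuse auto, clear

* Full factorial written with ##.
regress price i.foreign##c.weight
scalar b_short = _b[1.foreign#c.weight]

* The same model written out term by term.
regress price i.foreign c.weight i.foreign#c.weight
scalar b_long = _b[1.foreign#c.weight]

* Quadratic via factor notation versus a hand-built squared variable.
regress price c.weight##c.weight
scalar b_sq_fv = _b[c.weight#c.weight]
generate double weight2 = weight^2
regress price weight weight2
scalar b_sq_manual = _b[weight2]

display "interaction: " scalar(b_short) " vs " scalar(b_long)
display "squared term: " scalar(b_sq_fv) " vs " scalar(b_sq_manual)
```

The coefficients agree either way, but only the factor-notation versions tell margins that the squared term moves together with weight.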
Reference category
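By default, the level with the lowest value is the reference (base) category. To change it, use the ib prefix: ib2.year makes the level whose value is 2 the reference, and ib(last).year makes the highest level the reference. The number after ib is a value, not a position; to pick the second level by position, use ib(#2).year.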
regress y ib2024.year c.x // 2024 is reference
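Changing the reference category relabels the comparisons without changing the fitted model. The sketch below illustrates this under the assumption that the bundled auto dataset stands in for your data, with rep78 (values 1 to 5) in place of year; the scalar names are illustrative.

```stata
* Assumption: the bundled auto dataset stands in for the reader's data.
sysuse auto, clear

* Base = the level whose value is 3.
regress price ib3.rep78 c.weight
scalar r2_base3 = e(r2)
scalar b5_vs_3  = _b[5.rep78]    // level 5 relative to level 3

* Base = the highest level (value 5).
regress price ib(last).rep78 c.weight
scalar r2_base5 = e(r2)
scalar b3_vs_5  = _b[3.rep78]    // level 3 relative to level 5

* Same fit; the 5-versus-3 contrast simply flips sign.
display "R-squared: " scalar(r2_base3) " vs " scalar(r2_base5)
display "contrast:  " scalar(b5_vs_3) " vs " scalar(b3_vs_5)
```

Since the fit is identical, choose the base that makes the reported contrasts easiest to read.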
margins — what coefficients actually mean
margins computes predicted values, marginal effects, and contrasts at specified covariate values. It is the bridge from a coefficient table to a paragraph of policy-relevant text.
regress lending_rate i.tier##c.assets
margins tier                                // average predicted rate by tier
margins tier, dydx(assets)                  // marginal effect of assets, by tier
margins, dydx(*)                            // average marginal effects of all covariates
margins tier, at(assets = (500 1000 2000))  // predicted rate by tier at three asset values
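In an interaction model, the marginal effect of the continuous variable in each group is a sum of coefficients, and margins does that arithmetic for you. The sketch below checks this by hand, assuming the bundled auto dataset with foreign in place of tier and weight in place of assets; the scalar and matrix names are illustrative.

```stata
* Assumption: the bundled auto dataset stands in for the reader's data.
sysuse auto, clear
regress price i.foreign##c.weight

* Slope of weight in each group, read straight off the coefficient table.
scalar slope_dom = _b[weight]                           // foreign == 0
scalar slope_for = _b[weight] + _b[1.foreign#c.weight]  // foreign == 1

* margins reports the same two slopes, with standard errors attached.
margins foreign, dydx(weight)
matrix slopes = r(b)

display "domestic: " scalar(slope_dom) " vs " slopes[1,1]
display "foreign:  " scalar(slope_for) " vs " slopes[1,2]
```

In a linear model the hand calculation is short, but the same margins call carries over unchanged to logit, probit, and other nonlinear models, where no single coefficient equals the marginal effect.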
marginsplot — visualising the effects
margins tier, at(assets = (200(200)2000))
marginsplot, recast(line) recastci(rline)
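marginsplot draws whatever the most recent margins call computed, so the at() grid determines the points on the plot. The sketch below uses a coarse grid to make that link visible, assuming the bundled auto dataset with foreign in place of tier and weight in place of assets; the grid values, the matrix name, and the graph name are illustrative choices.

```stata
* Assumption: the bundled auto dataset stands in for the reader's data.
sysuse auto, clear
regress price i.foreign##c.weight

* 3 grid points x 2 groups = 6 predicted values.
margins foreign, at(weight = (2000(1000)4000))
matrix preds = r(b)

* The first cell (weight = 2000, foreign = 0) is just the linear prediction.
scalar by_hand = _b[_cons] + 2000 * _b[weight]
display "margins: " preds[1,1] "   by hand: " scalar(by_hand)

* One line per group, with confidence limits drawn as lines.
marginsplot, recast(line) recastci(rline) name(price_by_origin, replace)
```

A finer grid such as 2000(200)4000 only matters visually when the model is nonlinear in the plotted variable; for a straight line, a few points suffice.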
Factor variable notation is a force multiplier
Once you internalise i., c., ##, #, and the margins command, you can specify and interpret models that would otherwise require hand-built dummy variables, product terms, and manual calculation of predicted values. That economy is one reason Stata persists in econometric work.
Exercise
Regress lending_rate on i.year and c.deposit_rate, then compute margins by year.