Factor variables, interactions, and margins — Stata Module 8

Stata's factor-variable notation makes complex specifications concise: i. marks a variable as categorical, c. marks it as continuous, ## requests a full factorial interaction (main effects plus the interaction terms), and # requests the interaction terms only. The margins command then translates the resulting coefficient table into predictions and marginal effects on the scale of the outcome.

i. and c. prefixes

stata

regress y i.year                    // year as categorical (one indicator per level, base level omitted)
regress y c.x                       // explicit continuous (the default outside interactions)
regress y i.year c.x                // year indicators plus a linear term in x
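
To see what i. expands to, a small sketch on Stata's bundled auto dataset may help. The dataset and its variables (price, rep78, weight) are stand-ins chosen for illustration and are not part of this module's data.

```stata
* Illustration only: auto.dta ships with Stata; price, rep78 and weight
* stand in for y, year and x in the examples above.
sysuse auto, clear

* List the terms that i.rep78 expands to; the base level is flagged with "b"
fvexpand i.rep78
display "`r(varlist)'"

* One coefficient per non-base level of rep78, plus a linear slope on weight
regress price i.rep78 c.weight

* The baselevels option adds the omitted base row to the output
regress price i.rep78 c.weight, baselevels
```

No indicator variables are created in the dataset; the expansion happens inside the estimation command.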

Interactions: ## and #

stata

regress y i.tier##c.assets          // main effects + interaction
regress y i.tier#c.assets           // interaction terms only (no main effects)
regress y c.x##c.x                  // x and x-squared (a variable interacted with itself)
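
The ## operator is shorthand, and it can help to write out the longhand once. The sketch below assumes Stata's bundled auto dataset, with price, foreign and weight as illustrative stand-ins; weight_sq is a hypothetical helper variable. Each pair of commands should produce the same fit.

```stata
sysuse auto, clear

* Full factorial: the shorthand and its expansion are the same model
regress price i.foreign##c.weight
regress price i.foreign c.weight i.foreign#c.weight

* Quadratic: c.weight##c.weight versus a hand-built squared term
regress price c.weight##c.weight
generate double weight_sq = weight^2      // hypothetical helper variable
regress price weight weight_sq

* Inside # and ##, an unprefixed variable is treated as categorical,
* so continuous variables need the c. prefix there.
* With no main effects, this fits a separate weight slope per level of foreign.
regress price i.foreign#c.weight
```

The hand-built version gives the same coefficients, but Stata no longer knows that weight_sq depends on weight, so a later margins, dydx(weight) would ignore the squared term. Keeping the term in factor-variable notation avoids that.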

Reference category

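By default, the level with the smallest value is the base (reference) category. To change it, use ib#., where # is the value of the desired level rather than its position: ib2.tier makes the level coded 2 the base (assuming tier has such a level). Use ib(last).year to make the highest level the base.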

stata

regress y ib2024.year c.x           // 2024 is reference
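
The base-level operators can be sketched on Stata's bundled auto dataset, where rep78 takes the values 1 to 5. The dataset is an illustrative assumption and not this module's data.

```stata
sysuse auto, clear

regress price i.rep78                 // default: lowest value (1) is the base
regress price ib3.rep78               // the level with value 3 is the base
regress price ib(last).rep78          // highest value (5) is the base
regress price ib(freq).rep78          // most frequent level is the base

* Set the base once so that later i.rep78 terms use it
fvset base 3 rep78
regress price i.rep78, baselevels     // baselevels displays the base row
```

Changing the base changes which comparisons the coefficients report, not the fit of the model.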

margins — what coefficients actually mean

margins computes predicted values, marginal effects, and contrasts at specified covariate values. It is the bridge from a coefficient table to a paragraph of policy-relevant text.

stata

regress lending_rate i.tier##c.assets

margins tier                                  // average predicted rate by tier
margins tier, dydx(assets)                    // marginal effect of assets, by tier
margins, dydx(*)                              // average marginal effects of all covariates
margins tier, at(assets = (500 1000 2000))    // predicted rate by tier at three asset values
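
In a linear model with an interaction, the by-group marginal effects that margins reports can be rebuilt from the coefficients, which is a useful check on what dydx() is doing. A minimal sketch, assuming Stata's bundled auto dataset with price, foreign and weight standing in for lending_rate, tier and assets:

```stata
sysuse auto, clear
regress price i.foreign##c.weight

* Slope of weight within each level of foreign
margins foreign, dydx(weight)

* The same two slopes built by hand from the coefficient table
lincom weight                          // slope for foreign == 0 (base level)
lincom weight + 1.foreign#c.weight     // slope for foreign == 1

* Average predicted price by group, then at chosen weights
margins foreign
margins foreign, at(weight = (2000 3000 4000))
```

The point estimates from lincom should agree with the margins output. In nonlinear models such as logit the marginal effect depends on all covariates, so this hand calculation no longer applies and margins does the averaging for you.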

marginsplot — visualising the effects

stata

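margins tier, at(assets = (200(200)2000))     // predictions by tier over a grid of assets: 200, 400, ..., 2000
marginsplot, recast(line) recastci(rline)     // lines for the predictions, bounding lines for the CIs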
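
marginsplot draws whatever the most recent margins command produced, so the two always run as a pair. A hedged sketch on Stata's bundled auto dataset (the variables, titles, graph name and file name are all illustrative placeholders):

```stata
sysuse auto, clear
regress price i.foreign##c.weight

* Predictions for each group over a grid of weights: 2000, 2500, ..., 4500
margins foreign, at(weight = (2000(500)4500))

* Lines for the predictions, dashed bounding lines for the confidence intervals
marginsplot, recast(line) recastci(rline) ciopts(lpattern(dash)) ///
    title("Predicted price by origin") ytitle("Predicted price") ///
    name(price_by_origin, replace)

* Save the graph to a file (the file name is a placeholder)
graph export "price_by_origin.png", replace
```

The continuous variable in at() goes on the horizontal axis and each level of the factor variable gets its own line.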

Factor variable notation is a force multiplier

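Once you internalise i., c., ##, #, and the margins command, you can specify and interpret in a few lines models that would otherwise require building indicator and interaction variables by hand and deriving marginal effects manually. This is one reason Stata remains common in applied econometric work.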

Exercise

Regress lending_rate on i.year and c.deposit_rate, then compute margins by year.