Module 08 of 12 · 55 min read · Intermediate

Factor variables, interactions, and margins

i. for categorical variables, c. for continuous, ## for interactions, and the margins command that makes nonlinear effects readable.

Learning objectives

By the end of this module, you should be able to:

  • Use i. for categorical predictors, c. for continuous predictors, ## for full interactions, and # for interaction-only terms
  • Change reference categories with ib# or ib(last) notation
  • Compute average marginal effects, predicted values, and contrasts using margins
  • Visualise marginal effects with marginsplot

Stata's factor variable notation — i. for categorical, c. for continuous, ## for full interactions, # for interaction-only — makes complex specifications concise. The margins command then translates the resulting coefficient table into human-readable marginal effects.

i. and c. prefixes

stata
regress y i.year // year as categorical (one dummy per non-reference level)
regress y c.x // explicit continuous (default for numerics)
regress y i.year c.x // year dummies plus a continuous x
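
To see what the i. prefix does behind the scenes, the sketch below fits the same model twice on the auto dataset that ships with Stata: once with factor-variable notation and once with hand-built indicators. The dataset, the variables price, rep78, and weight, and the stub name rep_ are illustrative choices, not part of this module's data. Both regressions should report the same coefficients.

```stata
sysuse auto, clear                      // example dataset shipped with Stata

* Factor-variable version: rep78 enters as indicators, weight as continuous
regress price i.rep78 c.weight

* Manual equivalent: build one indicator per level, then omit the first
tabulate rep78, generate(rep_)          // creates rep_1 ... rep_5
regress price rep_2 rep_3 rep_4 rep_5 weight
```

The factor-variable version leaves no extra variables in the dataset, and margins can only recognise rep78 as categorical when it is entered with i.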

Interactions: ## and #

stata
regress y i.tier##c.assets // main effects + interaction
regress y i.tier#c.assets // interaction only
regress y c.x##c.x // x and x-squared (using the same variable twice)
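
A minimal sketch of how the two operators differ, again assuming the auto dataset shipped with Stata (foreign and weight stand in for tier and assets). The first two regressions are the same model written two ways; the third drops the main effects.

```stata
sysuse auto, clear

* ## is shorthand for both main effects plus their product
regress price i.foreign##c.weight
regress price i.foreign c.weight i.foreign#c.weight   // same model, written out

* # alone drops the main effects: one weight slope per origin, common intercept
regress price i.foreign#c.weight

* Quadratic in weight: weight and weight squared
regress price c.weight##c.weight
```

Writing the square as c.weight##c.weight rather than generating a separate squared variable lets margins know that the two terms move together when weight changes.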

Reference category

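By default, the lowest level is the reference. To change it, write ib2.year (the level whose value is 2 becomes the reference) or ib(last).year (the highest level becomes the reference). The number after ib is a value of the variable, not a position, so a year variable coded with calendar years needs ib2024.year rather than ib5.year.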

stata
regress y ib2024.year c.x // 2024 is reference
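
The following sketch compares several base-level choices on the auto dataset shipped with Stata, where rep78 takes the values 1 to 5 (an illustrative stand-in for year). The baselevels display option makes the omitted category visible in the output, which is a quick way to confirm which level is serving as the reference.

```stata
sysuse auto, clear

regress price i.rep78, baselevels          // default base: lowest level (1)
regress price ib3.rep78, baselevels        // base: the level whose value is 3
regress price ib(last).rep78, baselevels   // base: highest level (5)
regress price ib(freq).rep78, baselevels   // base: most frequent level

* Optional: store a default base on the variable for later i.rep78 calls
fvset base 3 rep78
regress price i.rep78, baselevels
```

Changing the base re-expresses the same fitted model: predictions are unchanged, and only the comparison each coefficient represents is different.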

margins — what coefficients actually mean

margins computes predicted values, marginal effects, and contrasts at specified covariate values. It is the bridge from a coefficient table to a paragraph of policy-relevant text.

stata
regress lending_rate i.tier##c.assets
margins tier // average predicted rate by tier
margins tier, dydx(assets) // marginal effect of assets, by tier
margins, dydx(*) // average marginal effects of all covariates
margins tier, at(assets=(500 1000 2000)) // predicted rate by tier at three asset levels
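
In a linear model with an interaction, the marginal effect of the continuous variable in a given group is the main-effect slope plus that group's interaction coefficient. The sketch below checks this on the auto dataset shipped with Stata (foreign and weight are stand-ins for tier and assets) and also shows one way to request a contrast. The point estimate from lincom should agree with the foreign row of the margins output up to numerical precision.

```stata
sysuse auto, clear
regress price i.foreign##c.weight

* Slope of weight for foreign cars, two ways
margins foreign, dydx(weight)              // slope by origin
lincom weight + 1.foreign#c.weight         // same number for foreign == 1

* Contrast: difference in average predicted price, foreign vs domestic
margins r.foreign
```

In nonlinear models such as logit, no single combination of coefficients gives the marginal effect, which is where margins saves the most work.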

marginsplot — visualising the effects

stata
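margins tier, at(assets=(200(200)2000)) // predictions by tier over a grid of asset values
marginsplot, recast(line) recastci(rline) // lines for the predictions, outlined confidence bands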
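
A runnable sketch of the same workflow on the auto dataset shipped with Stata, meant to be run from a do-file because of the /// line continuations. The titles and the output file name are hypothetical, and recastci(rarea) is swapped in here to draw shaded bands instead of outlines.

```stata
sysuse auto, clear
regress price i.foreign##c.weight

margins foreign, at(weight=(2000(500)4000))
marginsplot, recast(line) recastci(rarea) ///
    title("Predicted price by weight and origin") ///
    ytitle("Predicted price (USD)") xtitle("Weight (lbs)")
graph export "margins_foreign.png", replace   // hypothetical file name
```

marginsplot always plots the most recent margins results, so the two commands need to run back to back.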

Factor variable notation is a force multiplier

Once you internalise i., c., ##, #, and the margins command, you can specify and interpret models that would otherwise require hand-built dummy variables, interaction terms, and manual calculations on the coefficients. This is one reason Stata remains common in econometric work.

Exercise

Regress lending_rate on i.year and c.deposit_rate, then compute margins by year.
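
One possible starting point, assuming a dataset containing lending_rate, year, and deposit_rate is already in memory (no such file is supplied here):

```stata
* Assumes a dataset with lending_rate, year, and deposit_rate is in memory
regress lending_rate i.year c.deposit_rate
margins year                 // average predicted lending rate for each year
marginsplot                  // plot those predictions with confidence intervals
```

Each margin is the average prediction obtained by treating every observation as if it belonged to that year while keeping its observed deposit_rate.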

Key takeaways

  • i.var creates one dummy per non-reference level; c.var explicitly marks a variable as continuous
  • i.tier##c.assets expands to main effects plus interaction, which saves typing for full-factorial specs
  • margins is the bridge from coefficient table to policy-relevant interpretation
  • ib2.year sets the level whose value is 2 as the reference; ib(last).year uses the last level

Further reading

  1. Using Stata for Quantitative Analysis (3rd Edition)

    Kyle Longest · Sage · 2019
LeadAfrik Public Economics Hub