The first ten minutes of any Stata session: load the data, look at it, understand its shape. Skip this and every subsequent error becomes mysterious.
use — Stata's native format
use bankrates.dta, clear
* clear discards any data currently in memory
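To see why the clear option matters, here is a minimal sketch. It assumes Stata's bundled auto dataset (loaded with sysuse) as a stand-in for bankrates.dta, so it runs without any external file. Without clear, Stata refuses to replace data in memory that has unsaved changes.

```stata
* Assumption: the bundled auto dataset stands in for bankrates.dta
sysuse auto, clear

* Change the data in memory without saving it
replace price = price + 1 in 1

* Loading without clear now fails; capture stores the return code in _rc
capture sysuse auto
display "return code without clear: " _rc

* With clear, the unsaved change is discarded and the load succeeds
sysuse auto, clear
```

A nonzero return code on the captured load is Stata protecting unsaved work. Adding clear tells it that discarding the data in memory is intended.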
import — bringing in CSV/Excel/etc
import delimited "rates.csv", clear
import excel "data.xlsx", sheet("Sheet1") firstrow clear
import sas "data.sas7bdat", clear
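The import commands accept options that control how columns are read. The sketch below is a round trip under stated assumptions: it writes the bundled auto dataset to a hypothetical scratch file, auto_roundtrip.csv, in the working directory, then reads it back with two common import delimited options.

```stata
* Assumption: auto_roundtrip.csv is a hypothetical scratch file in the working directory
sysuse auto, clear
export delimited using "auto_roundtrip.csv", replace

* varnames(1): take variable names from the first row
* stringcols(1): force column 1 to be read as a string
import delimited using "auto_roundtrip.csv", varnames(1) stringcols(1) clear

* Check that the types came back as expected
describe
```

For files that use a semicolon separator or a decimal comma, import delimited documents delimiters() and, in recent versions, decimalseparator(). Check help import delimited for the options available in your installation.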
describe and summarize
describe                        // variable list with types
summarize                       // numeric summary of all variables
summarize, detail               // with percentiles, skewness, kurtosis
summarize lending_rate, detail
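summarize also leaves its results in r(), which is handy when a number is needed later in a do-file. The sketch below assumes the bundled auto dataset and its price variable in place of lending_rate.

```stata
* Assumption: auto and price stand in for bankrates.dta and lending_rate
sysuse auto, clear

summarize price, detail

* Stored results remain available until the next command that sets r()
display "N      = " r(N)
display "mean   = " r(mean)
display "median = " r(p50)
display "p1/p99 = " r(p1) " / " r(p99)

* A compact overview: observation and variable counts only
describe, short
```

Run return list right after summarize to see every stored result it makes available.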
codebook — the deepest inspection
codebook is the most thorough way to understand a variable: type, range, missing-value count, unique values, frequencies for categorical variables, mean/SD for continuous.
codebook
codebook lending_rate
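On a wide dataset the full codebook output gets long. This sketch, again assuming the bundled auto dataset, shows one variable in full, the compact one-line-per-variable form, and misstable summarize as a quick missing-value report.

```stata
* Assumption: auto stands in for bankrates.dta; rep78 has some missing values
sysuse auto, clear

* Full report for a single variable, including the missing-value count
codebook rep78

* One line per variable: observations, unique values, mean, min, max, label
codebook, compact

* Only the variables that contain missing values
misstable summarize
```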
list — view raw rows
list in 1/10                    // first 10 observations
list month lending_rate in 1/10
list if lending_rate > 0.13
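One thing to watch with a condition such as lending_rate > 0.13: Stata treats numeric missing values as larger than any number, so rows with a missing value also satisfy a greater-than test. The sketch below demonstrates this on the bundled auto dataset, where rep78 (an assumption standing in for lending_rate) has a few missing values.

```stata
* Assumption: auto and rep78 stand in for bankrates.dta and lending_rate
sysuse auto, clear

* Missing values count as larger than any number, so they pass this test
count if rep78 > 4
local with_missing = r(N)

* Excluding them explicitly gives the intended rows
count if rep78 > 4 & !missing(rep78)
local without_missing = r(N)

display "rows matched only because rep78 is missing: " `with_missing' - `without_missing'

list make rep78 if rep78 > 4 & !missing(rep78)
```

Adding & !missing(varname) to greater-than conditions is a cheap habit that avoids silently including incomplete rows.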
browse — interactive data view
browse opens a spreadsheet-style view of the data. Useful for visual inspection, never for analysis.
save — persisting your work
save processed.dta, replace
export delimited "output.csv", replace
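A short sketch of the save step, under the assumption that the bundled auto dataset is the work being saved and that auto_processed.dta and auto_processed.csv are hypothetical output names in the working directory. It reloads the saved file to confirm the result.

```stata
* Assumption: output file names are hypothetical and land in the working directory
sysuse auto, clear

* Shrink storage types where possible without losing information
compress

* replace overwrites an existing file of the same name without asking
save "auto_processed.dta", replace
export delimited using "auto_processed.csv", replace

* Reload the saved copy to confirm it round-trips
use "auto_processed.dta", clear
describe, short
```

Because replace overwrites silently, saving processed data under a different name from the raw file keeps the original intact.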
describe + codebook + summarize, every time
After every load, run all three. They cost almost nothing and catch most common data-import problems: wrong variable types, hidden missing values, encoding errors, and decimal-point versus decimal-comma confusion.
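One way to make the habit automatic is to wrap the three commands in a small program. The sketch below is an illustration, not a standard Stata command: the name firstlook is hypothetical, codebook is run with the compact option to keep the output short, and the bundled auto dataset stands in for a freshly loaded file.

```stata
* Assumption: firstlook is a hypothetical helper name, not a built-in command
capture program drop firstlook
program define firstlook
    describe
    codebook, compact
    summarize
    * String variables deserve a second look: numbers imported as text end up here
    ds, has(type string)
end

* Assumption: auto stands in for whatever was just loaded
sysuse auto, clear
firstlook
```

If a variable that should be numeric appears in the ds output, a stray character or a decimal comma in the source file is a likely cause, and destring is the usual next step.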
Exercise
Load bankrates.dta, run describe and summarize.