Standard errors quantify how much an estimate would jitter under repeated sampling. They feed t-statistics, p-values, and confidence intervals — every claim of statistical significance rests on getting them right.
What a p-value actually says
A p-value is the probability of observing a test statistic at least as extreme as the one you got, IF the null hypothesis is true. That's it. It is NOT:
- The probability the null is true (that's a Bayesian quantity)
- The probability your finding is real
- 1 minus statistical power
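For a test statistic that is approximately standard normal under the null, the definition above reduces to a tail-area calculation. The following is a minimal sketch using only Python's standard library. The function name two_sided_p_value is illustrative, and the normal approximation is an assumption that is reasonable in large samples.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """P(|Z| >= |z|) for Z ~ N(0, 1): a probability computed assuming the null is true."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


p = two_sided_p_value(1.96)
print(f"two-sided p-value for z = 1.96: {p:.3f}")
```

Nothing in this calculation refers to the probability that the null itself is true. The null is taken as given, and only the data are treated as random.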
p < 0.05 ≠ true
If 100 researchers independently test a null that's actually true, about 5 of them will reject it at the 5% level on average. That's the design of the test. With selective publication, journals fill with the lucky 5. The replication crisis is partly this.
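The false-positive rate can be checked by simulation. The sketch below is illustrative rather than definitive: it assumes each study draws 50 observations from a standard normal with a true mean of zero and runs a z-test with known variance. The function name, sample sizes, and seed are all arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean


def false_positive_rate(n_studies=10_000, n_obs=50, alpha=0.05, seed=0):
    """Share of studies that reject a null that is true by construction."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_studies):
        # The null (mean = 0) holds in every simulated study.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        z = fmean(sample) * math.sqrt(n_obs)  # known sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_studies


rate = false_positive_rate()
print(f"share of true nulls rejected at the 5% level: {rate:.3f}")
```

The printed share should sit close to 0.05. If only the rejecting studies were written up, every published result in this simulated literature would be a false positive.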
Heteroskedasticity-robust standard errors
The classic SE formula assumes homoskedasticity. Real data almost never has it. White (1980) gave us a heteroskedasticity-consistent estimator that requires only large samples and exogeneity:
Var(β̂) = (X'X)⁻¹ X' diag(û²) X (X'X)⁻¹
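In Stata: add the , robust option. In R: vcovHC() from the sandwich package. In statsmodels: fit(cov_type='HC3'). These do not all apply the same finite-sample variant (Stata's robust is HC1, while vcovHC() defaults to HC3), so small-sample SEs can differ slightly across packages. Use robust SEs by default. The cost in precision when the errors happen to be homoskedastic is small; the cost of getting SEs wrong is large.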
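To make the sandwich formula concrete, here is a minimal NumPy sketch of the basic White estimator (often labelled HC0), computed next to the classical SEs. It assumes simulated data whose error standard deviation grows with the regressor. The function name ols_with_robust_se and the data-generating process are illustrative, and the sketch omits the finite-sample adjustments that HC1 and HC3 add.

```python
import numpy as np


def ols_with_robust_se(X, y):
    """Return OLS coefficients, classical SEs, and HC0 (White) SEs."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical: sigma^2 (X'X)^-1, valid only under homoskedasticity.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

    # White: (X'X)^-1 X' diag(u^2) X (X'X)^-1, without building the n x n diagonal.
    meat = (X * resid[:, None] ** 2).T @ X
    se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_classical, se_hc0


# Simulated example (assumption): error sd is proportional to x.
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(0, 4, n)
y = 1.0 + 0.5 * x + rng.normal(0, 1, n) * x
X = np.column_stack([np.ones(n), x])

beta, se_classical, se_hc0 = ols_with_robust_se(X, y)
print("slope:", beta[1])
print("classical SE:", se_classical[1], "robust SE:", se_hc0[1])
```

In this setup the error variance is largest where x is far above its mean, so the classical formula understates the uncertainty in the slope and the robust SE comes out larger.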
Cluster-robust standard errors
When observations are correlated within groups (students within schools, workers within firms, observations within country-years), independent-observation SEs lie, usually by being too small. Cluster the SEs instead:
- Cluster at the highest level of meaningful correlation
- Need ~30+ clusters for asymptotic results to apply
- With few clusters, use wild bootstrap (Cameron-Gelbach-Miller 2008)
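The cluster-robust estimator keeps the sandwich form but sums the scores within each cluster before taking outer products. The sketch below is a hedged illustration, not a reference implementation. It assumes a regressor that varies only across clusters and an error with a shared within-cluster component. The function name cluster_robust_se and the simulated design are illustrative, and the small-sample correction shown is one commonly used choice. It does not implement the wild bootstrap.

```python
import numpy as np


def cluster_robust_se(X, y, groups):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    labels = np.unique(groups)
    meat = np.zeros((k, k))
    for g in labels:
        mask = groups == g
        score = X[mask].T @ resid[mask]  # sum of x_i * u_i within the cluster
        meat += np.outer(score, score)

    n_groups = len(labels)
    # Common small-sample correction (an assumption; conventions vary by package).
    correction = (n_groups / (n_groups - 1)) * ((n - 1) / (n - k))
    cov = correction * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


# Simulated example (assumption): 100 clusters of 30 observations each.
rng = np.random.default_rng(0)
n_clusters, cluster_size = 100, 30
groups = np.repeat(np.arange(n_clusters), cluster_size)
x = rng.normal(size=n_clusters)[groups]      # regressor constant within cluster
shock = rng.normal(size=n_clusters)[groups]  # error component shared within cluster
y = 1.0 + 0.5 * x + shock + rng.normal(size=groups.size)
X = np.column_stack([np.ones(groups.size), x])

beta, se_cluster = cluster_robust_se(X, y, groups)

# Classical SEs for comparison, treating all 3,000 rows as independent.
resid = y - X @ beta
sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
se_iid = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
print("iid SE:", se_iid[1], "clustered SE:", se_cluster[1])
```

With this design the clustered SE on the slope should be several times the independent-observation SE, because the 3,000 rows carry roughly the information of 100 independent clusters.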
The bootstrap
Resample your data with replacement many times (≥1,000), drawing a sample of the original size each time, and re-estimate β̂ on each resample. The standard deviation across the bootstrap replications is your SE. Doesn't require closed-form analytics; works for almost any estimator.
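The procedure can be sketched in a few lines of NumPy. This is a minimal nonparametric bootstrap under the assumption of independent observations. The function name bootstrap_se, the choice of 2,000 replications, and the simulated exponential data are illustrative. The sample mean is used as the estimator only because its analytical SE is known, which makes the comparison easy.

```python
import numpy as np


def bootstrap_se(data, estimator, n_reps=2_000, seed=0):
    """Standard deviation of an estimator across bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(data)
    estimates = np.empty(n_reps)
    for b in range(n_reps):
        idx = rng.integers(0, n, size=n)  # draw n row indices with replacement
        estimates[b] = estimator(data[idx])
    return estimates.std(ddof=1)


# Simulated example (assumption): skewed data, estimator is the sample mean.
rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=500)

se_boot = bootstrap_se(sample, np.mean)
se_analytic = sample.std(ddof=1) / np.sqrt(sample.size)
print("bootstrap SE:", se_boot, "analytical SE:", se_analytic)
```

Any function that maps a resampled array to a number can be passed as the estimator, for example np.median. For clustered data, the resampling would need to draw whole clusters rather than single rows.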
When to bootstrap
When you can't write down an analytical SE — chained estimators, complex weighting, ratios of estimates. The bootstrap is also a useful sanity check on analytical SEs that look surprising.
Confidence intervals
A 95% CI for β: β̂ ± 1.96 × SE, where 1.96 is the large-sample normal critical value. Interpretation: 'in repeated sampling, 95% of intervals constructed this way would contain the true β.' Not 'there's a 95% chance the true β is in this interval': that's Bayesian language for a frequentist quantity.
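The repeated-sampling interpretation can be checked directly by simulation. The sketch below assumes the parameter is a population mean with a known true value, draws many independent samples, and counts how often the interval covers the truth. The function name ci_coverage and all numeric settings are illustrative.

```python
import math
import random


def ci_coverage(true_mean=2.0, n_obs=100, n_samples=4_000, seed=0):
    """Share of 95% intervals (estimate +/- 1.96 * SE) that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n_obs)]
        est = math.fsum(sample) / n_obs
        var = math.fsum((v - est) ** 2 for v in sample) / (n_obs - 1)
        se = math.sqrt(var / n_obs)
        lower, upper = est - 1.96 * se, est + 1.96 * se
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_samples


coverage = ci_coverage()
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share should land near 0.95. The true mean never moves. What varies from sample to sample is the interval, which is why the 95% describes the procedure rather than any single interval.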
Exercise
You estimate β̂ = 0.50, SE = 0.20, robust. Compute the t-stat, give a rough p-value, and a 95% CI.