Standard errors, p-values, and confidence — Econometrics Module 4

Standard errors quantify how much an estimate would jitter under repeated sampling. They feed t-statistics, p-values, and confidence intervals — every claim of statistical significance rests on getting them right.

What a p-value actually says

A p-value is the probability of observing a test statistic at least as extreme as the one you got, IF the null hypothesis is true. That's it. It is NOT:

The probability the null is true (that's a Bayesian quantity)
The probability your finding is real
1 minus statistical power
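
To make the definition concrete, here is a minimal sketch of a two-sided p-value, assuming the test statistic is standard normal under the null (the usual large-sample approximation). The function name and the example statistics are hypothetical and chosen only for illustration.

```python
import math

def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for Z ~ N(0, 1), i.e. the two-sided p-value under the null."""
    # For x >= 0, erfc(x / sqrt(2)) equals 2 * (1 - Phi(x)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical test statistics, for illustration only.
for z in (1.0, 1.96, 3.0):
    print(f"z = {z:.2f}  ->  p = {two_sided_p_value(z):.4f}")
```

A statistic of 1.96 gives a p-value of about 0.05. The calculation conditions on the null throughout, so it cannot say how likely the null itself is.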

p < 0.05 ≠ true

If 100 researchers independently test a null that's actually true, about 5 of them will reject it at the 5% level on average; that's the design of the test. With selective publication, journals fill with the lucky 5. The replication crisis is partly this.
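
The 5% figure can be checked by simulation. The sketch below is only an illustration, assuming each hypothetical study draws 50 observations from a normal distribution with mean zero and known variance and runs a two-sided z-test; the sample size, seed, and function names are all assumptions.

```python
import math
import random

def z_test_rejects(sample, sigma=1.0, crit=1.96):
    """Two-sided z-test of H0: mean = 0 with known sigma, at the 5% level."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return abs(z) > crit

def false_positive_rate(n_studies, n_obs=50, seed=0):
    """Share of studies that reject a null that is true by construction."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]  # the null is true
        rejections += z_test_rejects(sample)
    return rejections / n_studies

rate = false_positive_rate(n_studies=10_000)
print(f"share of true nulls rejected at the 5% level: {rate:.3f}")
```

The share should land close to 0.05. If only the rejecting studies get published, every published result in this simulated literature is a false positive.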

Heteroskedasticity-robust standard errors

The classic SE formula assumes homoskedasticity. Real data almost never has it. White (1980) gave us a heteroskedasticity-consistent estimator that requires only large samples and exogeneity:

Var(β̂) = (X'X)⁻¹ X' diag(û²) X (X'X)⁻¹

In Stata: add the , robust option (this gives HC1). In R: vcovHC() from the sandwich package (HC3 by default). In statsmodels: pass cov_type='HC3' to fit(). Use robust SEs by default. The cost in efficiency is small; the cost of getting SEs wrong is large.
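
For a regression with an intercept and a single regressor, the slope element of the sandwich formula reduces to Σ (xᵢ − x̄)² ûᵢ² / (Σ (xᵢ − x̄)²)². The sketch below computes that HC0 version (White's original, not the HC1 or HC3 variants the packages above report) next to the classical SE. The simulated data, whose noise spread grows with |x|, and all names are assumptions for illustration.

```python
import math
import random

def ols_slope_ses(x, y):
    """Slope of y on x (with intercept), plus classical and HC0 standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical SE: s^2 / Sxx, valid only under homoskedasticity.
    s2 = sum(u * u for u in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # HC0: slope element of (X'X)^-1 X' diag(u^2) X (X'X)^-1.
    se_hc0 = math.sqrt(sum((d * u) ** 2 for d, u in zip(dx, resid))) / sxx
    return slope, se_classical, se_hc0

rng = random.Random(0)
x = [rng.uniform(-5.0, 5.0) for _ in range(2_000)]
# Noise spread grows with |x|, so the data are heteroskedastic by construction.
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 0.5 + abs(xi)) for xi in x]

slope, se_classical, se_hc0 = ols_slope_ses(x, y)
print(f"slope={slope:.3f}  classical SE={se_classical:.4f}  HC0 SE={se_hc0:.4f}")
```

In this design the noisiest observations are also the ones with the most leverage on the slope, so the classical SE understates the uncertainty and the HC0 SE comes out larger.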

Cluster-robust standard errors

When observations are correlated within groups (students within schools, workers within firms, observations within country-years), SEs that assume independent observations are wrong, and typically too small. Cluster-robust SEs allow arbitrary correlation within each group. Rules of thumb:

Cluster at the highest level of meaningful correlation
Aim for roughly 30 or more clusters; the asymptotics run in the number of clusters, not the number of observations
With few clusters, use wild bootstrap (Cameron-Gelbach-Miller 2008)
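
The sketch below illustrates why clustering matters, again for a regression with an intercept and one regressor. It sums the scores (xᵢ − x̄) ûᵢ within each cluster before squaring, which is the basic cluster-robust (CR0) estimator without the finite-sample correction that software packages typically apply. The 50 schools, 20 students per school, and all names are hypothetical.

```python
import math
import random
from collections import defaultdict

def slope_ses_clustered(x, y, groups):
    """Slope of y on x (with intercept), classical SE, and cluster-robust (CR0) SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    se_classical = math.sqrt(sum(u * u for u in resid) / (n - 2) / sxx)
    # Add up the scores within each cluster before squaring, so that
    # within-cluster correlation is allowed to accumulate.
    score_by_cluster = defaultdict(float)
    for d, u, g in zip(dx, resid, groups):
        score_by_cluster[g] += d * u
    se_cluster = math.sqrt(sum(s * s for s in score_by_cluster.values())) / sxx
    return slope, se_classical, se_cluster

rng = random.Random(0)
x, y, groups = [], [], []
for school in range(50):              # 50 hypothetical schools
    x_school = rng.gauss(0.0, 1.0)    # regressor varies only across schools
    shock = rng.gauss(0.0, 1.0)       # shock shared by everyone in the school
    for _ in range(20):               # 20 students per school
        x.append(x_school)
        y.append(2.0 + 0.5 * x_school + shock + rng.gauss(0.0, 1.0))
        groups.append(school)

slope, se_classical, se_cluster = slope_ses_clustered(x, y, groups)
print(f"slope={slope:.3f}  classical SE={se_classical:.4f}  cluster SE={se_cluster:.4f}")
```

Because both the regressor and part of the error are shared within a school, the 1,000 observations carry much less independent information than the classical formula assumes, and the cluster-robust SE should be several times larger.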

The bootstrap

Draw n observations with replacement from your sample of size n, re-estimate β̂, and repeat many times (≥1,000). The standard deviation of β̂ across the bootstrap replications is your SE. No closed-form variance formula is required, and the method works for almost any estimator.
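
A minimal sketch of the procedure for an OLS slope, assuming i.i.d. observations so that whole (x, y) pairs can be resampled (the pairs bootstrap). The simulated data, the seeds, and the function names are assumptions for illustration.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope from a regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

def bootstrap_se(x, y, n_reps=1_000, seed=0):
    """Pairs bootstrap: resample rows with replacement, re-estimate, take the SD."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_reps):
        idx = [rng.randrange(n) for _ in range(n)]  # n row indices, with replacement
        draws.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    return statistics.stdev(draws)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

se_boot = bootstrap_se(x, y)
print(f"slope={ols_slope(x, y):.3f}  bootstrap SE={se_boot:.4f}")
```

With unit-variance noise, a unit-variance regressor, and 200 observations, the analytical SE is roughly 1/√200 ≈ 0.07, so the bootstrap figure should land in that neighborhood. For clustered data the resampling unit would be the cluster rather than the row.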

When to bootstrap

When you can't write down an analytical SE — chained estimators, complex weighting, ratios of estimates. The bootstrap is also a useful sanity check on analytical SEs that look surprising.

Confidence intervals

A 95% CI for β: β̂ ± 1.96 × SE, where 1.96 is the large-sample normal critical value. Interpretation: 'in repeated sampling, 95% of intervals constructed this way would contain the true β.' Not 'there's a 95% chance the true β is in this interval'; that's Bayesian language for a frequentist quantity.
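
The repeated-sampling interpretation can be illustrated directly. The sketch below assumes a simple setting in which the parameter is a population mean (set to a hypothetical 0.5), draws many samples, builds the interval estimate ± 1.96 × SE from each one, and counts how often the interval contains the true value. All names and settings are illustrative.

```python
import math
import random

def ci_95(sample):
    """Large-sample 95% interval for a mean: estimate +/- 1.96 * SE."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((v - mean) ** 2 for v in sample) / (n - 1)
    se = math.sqrt(var / n)
    return mean - 1.96 * se, mean + 1.96 * se

def coverage(true_mean=0.5, n_obs=100, n_samples=5_000, seed=0):
    """Share of repeated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, 2.0) for _ in range(n_obs)]
        lo, hi = ci_95(sample)
        hits += lo <= true_mean <= hi
    return hits / n_samples

share = coverage()
print(f"share of intervals containing the true mean: {share:.3f}")
```

The share should be close to 0.95. The 95% describes the procedure across samples: any single interval either contains the true value or it does not.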

Exercise

You estimate β̂ = 0.50, SE = 0.20, robust. Compute the t-stat, give a rough p-value, and a 95% CI.