Module 11 of 12 · 50 min read · Intermediate

Bootstrap and resampling

Nonparametric inference. Percentile, BCa, block bootstrap for time-series. When the bootstrap saves you and when it doesn't.

Bootstrapping is the modern Swiss-army knife of statistical inference. Given a data sample and almost any statistic, it yields standard errors, confidence intervals, and hypothesis tests with few distributional assumptions, provided the resampling scheme matches how the data were generated. Introduced by Efron (1979), it has become the practical default whenever closed-form asymptotic results are unavailable or untrusted.

The basic idea

We have a sample x = (x₁, ..., xₙ) and compute a statistic θ̂(x). We want to know the sampling distribution of θ̂. The bootstrap treats the empirical distribution as if it were the population: draw B bootstrap samples x*₁, ..., x*_B, each of size n, by resampling with replacement from x; compute θ̂(x*_b) for each; then use the empirical distribution of {θ̂(x*_b)} as an approximation to the sampling distribution of θ̂.
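
The resampling loop described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `bootstrap_distribution`, the simulated exponential data, and the choice of the median as the statistic are all assumptions made for the example.

```python
import numpy as np

def bootstrap_distribution(x, statistic, n_boot=2000, seed=0):
    """Return n_boot replicates of `statistic` on resamples of x (with replacement)."""
    x = np.asarray(x)
    rng = np.random.default_rng(seed)
    n = len(x)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # n indices drawn with replacement
        reps[b] = statistic(x[idx])
    return reps

# Illustrative data (simulated, not from the text)
rng = np.random.default_rng(42)
x = rng.exponential(scale=1.0, size=200)

reps = bootstrap_distribution(x, np.median)
se_median = reps.std(ddof=1)  # bootstrap standard error of the median
```

The spread of `reps` stands in for the sampling variability of the median, a statistic with no convenient closed-form standard error.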

Three flavours of bootstrap CI

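  • Percentile: take the 2.5th and 97.5th percentiles of the bootstrap distribution. Simple, but coverage suffers when the estimator is biased or its distribution is skewed.
  • Basic (reflection): the interval runs from 2θ̂ - q_(0.975) to 2θ̂ - q_(0.025), where q_(α) is the α-quantile of the bootstrap distribution. Reflecting the quantiles around θ̂ moves the interval in the direction that offsets the bias.
  • BCa (bias-corrected and accelerated, Efron 1987): the gold standard for general use. Shifts the percentile levels to adjust for both bias and skewness.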
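
To make the three constructions concrete, here is a hedged sketch that computes all of them from one set of bootstrap replicates. The function name `bootstrap_cis` and the lognormal test data are assumptions for illustration; the BCa part follows the usual textbook recipe (bias correction z0 from the share of replicates below θ̂, acceleration from jackknife leave-one-out values) and does not guard against degenerate cases such as a constant statistic.

```python
import numpy as np
from statistics import NormalDist

def bootstrap_cis(x, statistic, n_boot=4000, alpha=0.05, seed=0):
    """Percentile, basic and BCa intervals from one set of bootstrap replicates."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = statistic(x)
    reps = np.array([statistic(x[rng.integers(0, n, size=n)])
                     for _ in range(n_boot)])

    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    percentile = (lo, hi)
    basic = (2 * theta_hat - hi, 2 * theta_hat - lo)  # reflect around theta_hat

    # BCa bias correction: normal quantile of the share of replicates below theta_hat
    nd = NormalDist()
    p_below = np.clip(np.mean(reps < theta_hat), 1 / n_boot, 1 - 1 / n_boot)
    z0 = nd.inv_cdf(float(p_below))

    # BCa acceleration from jackknife (leave-one-out) estimates
    jack = np.array([statistic(np.delete(x, i)) for i in range(n)])
    d = jack.mean() - jack
    denom = 6.0 * np.sum(d ** 2) ** 1.5
    a = np.sum(d ** 3) / denom if denom > 0 else 0.0

    def adjusted_level(p):
        # Percentile level after the bias and acceleration adjustments
        z = nd.inv_cdf(p)
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    levels = [adjusted_level(alpha / 2), adjusted_level(1 - alpha / 2)]
    bca = tuple(np.quantile(reps, levels))
    return {"percentile": percentile, "basic": basic, "bca": bca}

# Illustrative skewed data (simulated, not from the text)
rng = np.random.default_rng(7)
x = rng.lognormal(mean=0.0, sigma=1.0, size=80)
cis = bootstrap_cis(x, np.mean)
```

On skewed data like this the three intervals generally differ, which is the practical reason for choosing between them deliberately.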

Block bootstrap for time series

I.i.d. bootstrap destroys the dependence structure of time-series data. Block bootstrap resamples contiguous blocks of length L, preserving short-range dependence. Variants: non-overlapping blocks (Carlstein), overlapping blocks (Künsch), stationary bootstrap (Politis-Romano) with random block length.
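
As a rough illustration of the overlapping-blocks variant, the sketch below draws block start points at random and concatenates the blocks until the resample has the original length. The helper name `moving_block_resample`, the simulated AR(1) series with coefficient 0.7, and the block length of 20 are assumptions chosen for the example.

```python
import numpy as np

def moving_block_resample(x, block_len, rng):
    """One overlapping-blocks resample with the same length as x."""
    x = np.asarray(x)
    n = len(x)
    n_blocks = -(-n // block_len)  # ceil(n / block_len)
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    # Each row is one contiguous block of indices; trim to length n
    idx = (starts[:, None] + np.arange(block_len)).ravel()[:n]
    return x[idx]

# Illustrative AR(1) series (simulated; rho = 0.7 is an arbitrary choice)
rng = np.random.default_rng(1)
n, rho = 1000, 0.7
eps = rng.normal(size=n)
series = np.empty(n)
series[0] = eps[0]
for t in range(1, n):
    series[t] = rho * series[t - 1] + eps[t]

block_means = [moving_block_resample(series, 20, rng).mean() for _ in range(2000)]
iid_means = [rng.choice(series, size=n, replace=True).mean() for _ in range(2000)]
block_se = np.std(block_means, ddof=1)
iid_se = np.std(iid_means, ddof=1)
```

For a positively autocorrelated series such as this one, `block_se` comes out larger than `iid_se`, because the i.i.d. resampling discards the persistence that inflates the variance of the mean.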

Choosing the block length

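Rule of thumb: L ≈ n^(1/3) (read as the expected block length for the stationary bootstrap), or L set from an AR(1) approximation: L ≈ log(0.05) / log(|ρ̂|), where ρ̂ is the sample lag-1 autocorrelation of the series of interest. This is the lag at which the implied autocorrelation |ρ̂|^L has decayed to 0.05; for example, ρ̂ = 0.5 gives L ≈ 4.3, so use 5. Too large an L leaves few distinct blocks and makes the bootstrap estimate noisy; too small an L breaks up persistence and, for positively autocorrelated data, understates the variance.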
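
Both rules of thumb are easy to compute. The sketch below is an illustration under assumptions: the function name `block_length_rules` is hypothetical, the series is simulated, and the AR(1) rule is written as log(0.05) / log(|ρ̂|) so that the result is positive whenever 0 < |ρ̂| < 1.

```python
import math
import numpy as np

def block_length_rules(x, threshold=0.05):
    """Return (rho_hat, cube-root rule, AR(1) decay rule) for a 1-D series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    # Sample lag-1 autocorrelation
    rho = float(np.dot(xc[:-1], xc[1:]) / np.dot(xc, xc))
    cube_root = max(1, round(n ** (1 / 3)))
    if 0 < abs(rho) < 1:
        # Smallest integer lag at which |rho|**L has fallen to the threshold
        ar1 = max(1, math.ceil(math.log(threshold) / math.log(abs(rho))))
    else:
        ar1 = 1  # no usable AR(1) signal; fall back to single observations
    return rho, cube_root, ar1

# Illustrative series of 504 observations (simulated, not from the text)
rng = np.random.default_rng(3)
eps = rng.normal(size=504)
series = np.empty(504)
series[0] = eps[0]
for t in range(1, 504):
    series[t] = 0.5 * series[t - 1] + eps[t]

rho_hat, l_cube_root, l_ar1 = block_length_rules(series)
```

With n = 504 the cube-root rule gives 8. The AR(1) rule depends on the estimated ρ̂ and grows quickly as |ρ̂| approaches 1, so it is worth comparing the two before settling on a block length.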

Parametric bootstrap

Fit a parametric model, draw simulated samples from the fitted model, compute the statistic on each. Useful when the model is plausible and the sample is small; combines distributional structure with simulation.
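
A minimal sketch of the idea, assuming a normal model purely for illustration (the function name `parametric_bootstrap_normal`, the simulated sample of 15 observations, and the 90th-percentile statistic are all hypothetical choices):

```python
import numpy as np

def parametric_bootstrap_normal(x, statistic, n_boot=2000, seed=0):
    """Replicates of `statistic` on samples simulated from a fitted normal model."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    mu_hat, sigma_hat = x.mean(), x.std(ddof=1)  # fit the assumed model
    sims = rng.normal(mu_hat, sigma_hat, size=(n_boot, len(x)))
    return np.array([statistic(row) for row in sims])

def q90(sample):
    # 90th percentile: a statistic that is noisy in small samples
    return np.quantile(sample, 0.9)

# Illustrative small sample (simulated, not from the text)
rng = np.random.default_rng(11)
x = rng.normal(loc=0.5, scale=2.0, size=15)

reps = parametric_bootstrap_normal(x, q90)
se_q90 = reps.std(ddof=1)
ci_low, ci_high = np.percentile(reps, [2.5, 97.5])
```

The simulated samples can take values never seen in the data, which is what the model buys you in small samples. If the normal assumption is wrong, the resulting interval inherits that error.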

When bootstrap fails

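  • Heavy tails with infinite variance: the standard bootstrap of the sample mean is inconsistent, so bootstrap variance estimates and intervals can be misleading.
  • Estimators of the maximum/minimum: a bootstrap maximum can never exceed the sample maximum, and it equals the sample maximum with probability 1 - (1 - 1/n)ⁿ ≈ 0.63, so the bootstrap distribution of the maximum is biased and has a point mass at the observed value.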
  • Boundary parameters: variance components near zero, restricted regressions.
  • Strong dependence not captured by block size: long memory series need m-out-of-n bootstrap or subsampling.
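
The failure for the maximum is easy to see numerically. In the sketch below (uniform data and the sample size of 200 are assumptions for illustration), the share of bootstrap maxima that coincide exactly with the sample maximum is compared with the theoretical value 1 - (1 - 1/n)ⁿ.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=200)  # illustrative data (simulated)
n = len(x)

# Bootstrap distribution of the sample maximum
boot_max = np.array([x[rng.integers(0, n, size=n)].max() for _ in range(5000)])

share_at_sample_max = np.mean(boot_max == x.max())
theoretical = 1 - (1 - 1 / n) ** n  # chance a resample contains the largest point
```

Roughly 63% of the replicates sit on a single value and none lie above it, so the bootstrap distribution cannot resemble the continuous sampling distribution of the maximum.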

Implementation

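python
import numpy as np

def bootstrap_sharpe(returns, n_bootstrap=10000, seed=None):
    """I.i.d. bootstrap replicates of the annualised Sharpe ratio (daily returns)."""
    returns = np.asarray(returns, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(returns)
    sharpes = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        # Resample n daily returns with replacement
        sample = rng.choice(returns, size=n, replace=True)
        sharpes[b] = sample.mean() / sample.std(ddof=1) * np.sqrt(252)
    return sharpes

# Percentile 95% CI; `returns` is assumed to be a 1-D array of daily returns
sharpes = bootstrap_sharpe(returns)
ci_low, ci_high = np.percentile(sharpes, [2.5, 97.5])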

Subsampling — when bootstrap fails

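Subsampling (Politis-Romano-Wolf): draw subsamples of size m < n without replacement (contiguous blocks for time series), compute the statistic θ̂_m on each, and rescale by the convergence rate: for a √n-consistent estimator, the distribution of √m(θ̂_m - θ̂_n) across subsamples approximates that of √n(θ̂_n - θ). This requires m → ∞ with m/n → 0. Valid under much weaker conditions than the bootstrap, particularly for non-smooth statistics and certain dependent data.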
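
A hedged sketch of a subsampling interval for i.i.d. data follows. It assumes a √n convergence rate, uses random subsets rather than contiguous blocks (so it is not appropriate for time series as written), and the function name `subsampling_ci`, the simulated data, and the choice m = 50 are illustrative assumptions.

```python
import numpy as np

def subsampling_ci(x, statistic, m, n_sub=2000, alpha=0.05, seed=0):
    """Subsampling CI assuming a sqrt(n) convergence rate and i.i.d. data."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_n = statistic(x)
    roots = np.empty(n_sub)
    for j in range(n_sub):
        sub = rng.choice(x, size=m, replace=False)  # subsample without replacement
        roots[j] = np.sqrt(m) * (statistic(sub) - theta_n)
    q_lo, q_hi = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    # Invert the approximation sqrt(n) * (theta_n - theta) ~ roots
    return theta_n - q_hi / np.sqrt(n), theta_n - q_lo / np.sqrt(n)

# Illustrative data (simulated, not from the text)
rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=1000)
ci_low, ci_high = subsampling_ci(x, np.mean, m=50)
```

The choice of m matters in practice: it has to be large enough for the statistic to behave as it does at full sample size, yet small relative to n.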

Exercise

A strategy has 504 daily returns (2 years). You bootstrap the Sharpe ratio with 10,000 i.i.d. resamples. (1) Why is the i.i.d. bootstrap potentially wrong for this application? (2) Outline a correct procedure. (3) Suppose the resulting 95% CI for annualised Sharpe is [0.4, 1.8]. Interpret.

LeadAfrik Public Economics Hub