Statistics 5601 (Geyer, Fall 2007) Bootstrap Percentile

General Instructions

To do each example, just click the Submit button. You do not have to type in any R instructions or specify a dataset. That's already done for you.

Theory

The confidence intervals discussed on this web page are only first order correct, and that only under certain assumptions.

These intervals have been disparaged by other authors. Hall (The Bootstrap and Edgeworth Expansion, Springer, 1992, and earlier papers) characterizes bootstrap percentile intervals as "looking up [in] the wrong statistical tables backwards."

Efron and Tibshirani's defense against this sort of criticism is their Section 13.4, titled "Is the Percentile Interval Backwards?"

Clearly, Efron and Tibshirani do have some arguments defending these intervals. If there exists a normalizing and variance stabilizing transformation φ = m(θ) as in their percentile interval lemma (p. 173), then the percentile interval does the right thing. When no such transformation exists, it may not. These arguments are not convincing to all theoreticians. The sad fact is that sometimes these intervals work and sometimes they don't.
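One consequence of the transformation argument can be checked numerically: a percentile interval computed on the scale of a monotone transformation maps back to the percentile interval on the original scale. The following is a minimal sketch, not taken from Efron and Tibshirani. The data x, the choice of the sample mean as estimator, and the choice of log as the transformation are all assumptions made for illustration, and type = 1 is used so that the quantiles are actual order statistics (with the default interpolating definition the agreement would only be approximate).

```r
# Hypothetical positive-valued data (not from the course).
set.seed(42)
x <- rexp(30)

nboot <- 999
theta.star <- double(nboot)
for (i in 1:nboot) {
    x.star <- sample(x, replace = TRUE)
    theta.star[i] <- mean(x.star)
}

conf.level <- 0.95
alpha <- 1 - conf.level
probs <- c(alpha / 2, 1 - alpha / 2)

# Percentile interval for theta, using order statistics (type = 1)
ci.theta <- quantile(theta.star, probs = probs, type = 1)

# Percentile interval for phi = log(theta), computed on the log scale
ci.phi <- quantile(log(theta.star), probs = probs, type = 1)

# Mapping the phi interval back with exp should reproduce the theta interval
all.equal(unname(exp(ci.phi)), unname(ci.theta))
```

This shows only that the percentile interval respects monotone transformations. It says nothing about whether a normalizing and variance stabilizing transformation exists for a given problem.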

In particular, when the bootstrap distribution of the estimator seems very biased, in that the bootstrap estimates theta.star tend to fall below theta.hat (the estimator applied to the original data), percentile intervals seem to make the wrong correction for bias. The percentile interval will lie mostly below theta.hat, whereas a bootstrap t interval will lie mostly above it.

Bootstrap Percentile Intervals

Section 13.3 in Efron and Tibshirani.

Comments

Everything down to the bottom of the for loop should be familiar, just like what we did calculating bootstrap standard errors (sd(theta.star) would be the bootstrap standard error).
The R function quantile (on-line help) calculates quantiles of a data vector, or at least what it calls quantiles. Its default definition is a bit eccentric, but it is asymptotically equivalent to all other definitions of quantiles.
If you want, for example, a 90% equal-tailed confidence interval, you replace the definition of conf.level by
```
conf.level <- 0.90
```
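A minimal sketch of the kind of calculation these comments describe is given below. The data x and the use of the sample median as the estimator are assumptions made for illustration, so the details may differ from the course example.

```r
# Hypothetical data and estimator, chosen only for illustration.
set.seed(101)
x <- rexp(50)
theta <- function(x) median(x)
theta.hat <- theta(x)

# Bootstrap loop: resample the data with replacement and re-estimate
nboot <- 1000
theta.star <- double(nboot)
for (i in 1:nboot) {
    x.star <- sample(x, replace = TRUE)
    theta.star[i] <- theta(x.star)
}

# Bootstrap standard error
sd(theta.star)

# Equal-tailed percentile interval from the quantiles of theta.star
conf.level <- 0.95
alpha <- 1 - conf.level
ci <- quantile(theta.star, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

Changing conf.level is the only edit needed to get an interval with a different confidence level, since alpha and the probs argument are computed from it.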

Bootstrap Percentile Intervals, Take Two

An alternative method for quantiles preferred by your humble instructor uses the following logic.

Use nboot <- 999 (or some other value such that nboot + 1 is a round number). The reason is that if X_(i) is the i-th order statistic of a sample of size n from a Uniform(0, 1) distribution, then

E{X_(i)} = i / (n + 1)

Another way to think of this is that the nboot data points divide the number line into nboot + 1 intervals, which as far as we know contain equal probability. They don't actually contain equal probability, because the sample is not the population, but we might as well treat them as if they did for the purposes of estimation. That is, our nboot data points should be taken as estimators of the quantiles with denominator nboot + 1.
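The expectation formula for uniform order statistics can be checked by simulation. This is a rough sketch: the sample size n = 9 and the number of simulations nsim are arbitrary choices made here, not values from the course.

```r
set.seed(7)
n <- 9          # sample size, so n + 1 = 10 is a round number
nsim <- 1e4     # number of simulated samples (arbitrary choice)

# Each column of u is one sorted Uniform(0, 1) sample of size n,
# so row i holds nsim draws of the i-th order statistic
u <- replicate(nsim, sort(runif(n)))

# Monte Carlo estimate of E{X_(i)} for i = 1, ..., n
sim.mean <- rowMeans(u)

# Theoretical values i / (n + 1)
theory <- (1:n) / (n + 1)

# Side-by-side comparison
round(cbind(theory, sim.mean), 3)
```

The simulated means should sit close to 0.1, 0.2, ..., 0.9, which are the values given by the formula with denominator n + 1 rather than n.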

In particular, if nboot is 999, then we take the ordered theta.star values to be the 0.001, 0.002, . . ., 0.999 quantiles of the sampling distribution of theta.hat. Thus