To do each example, just click the button that goes with it.
You do not have to type in any R instructions or specify a dataset.
That's already done for you.
The confidence intervals discussed on this web page are only first order correct, and that only under certain assumptions.
These intervals have been disparaged by other authors.
Hall (The Bootstrap and Edgeworth Expansion, Springer, 1992,
and earlier papers) characterizes bootstrap percentile intervals as
"looking up [in] the wrong statistical tables backwards."
Efron and Tibshirani's defense against this sort of criticism is their
Section 13.4, titled
"Is the Percentile Interval Backwards?"
Clearly, Efron and Tibshirani do have some arguments defending these intervals.
If there exists a normalizing and variance stabilizing transformation
φ = m(θ) as in their
percentile interval lemma
(p. 173), then the percentile interval does the right thing.
When no such transformation exists, it may not.
These arguments are not convincing to all theoreticians.
The sad fact is that sometimes these intervals work and sometimes they don't.
In particular, when the bootstrap distribution of the estimator seems
very biased, in that the bootstrap estimator θ* is mostly
below the estimator applied to the original data, θ̂,
then percentile intervals seem to make the wrong correction for bias.
The percentile interval will be mostly below θ̂,
whereas a bootstrap t interval will be mostly above.
Bootstrap Percentile Intervals
Section 13.3 in Efron and Tibshirani.
- Everything down to the bottom of the
for loop should be familiar, just like what we did calculating bootstrap standard errors
(sd(theta.star) would be the bootstrap standard error).
- The R function
quantile (on-line help) calculates quantiles of a data vector, or at least what it calls quantiles. Its definition is a bit eccentric, but is asymptotically equivalent to all other definitions of quantiles.
- If you want, for example, a 90% equal-tailed confidence interval, you
replace the definition of conf.level with
conf.level <- 0.90
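The steps in the list above can be sketched as follows. This is a minimal sketch, not the code from the example itself: the simulated data vector x, the choice of the median as the estimator, the seed, and the value of nboot are all assumptions made purely for illustration.

```r
# Hypothetical data and estimator, chosen only for illustration
set.seed(42)
x <- rexp(30)                 # stand-in for the real dataset
theta.hat <- median(x)        # estimator applied to the original data

nboot <- 1000
theta.star <- double(nboot)
for (i in 1:nboot) {
    # resample the data with replacement and recompute the estimator
    x.star <- sample(x, replace = TRUE)
    theta.star[i] <- median(x.star)
}

sd(theta.star)                # bootstrap standard error

# equal-tailed percentile interval from the bootstrap distribution
conf.level <- 0.95
alpha <- 1 - conf.level
ci <- quantile(theta.star, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

Changing conf.level to 0.90 in this sketch gives the 5% and 95% quantiles of theta.star instead.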
Bootstrap Percentile Intervals, Take Two
An alternative method for quantiles preferred by your humble instructor uses the following logic.
Use nboot <- 999 (or some other
value such that
nboot + 1 is a round number). The reason is
that if X(i) is the i-th order
statistic of a sample of size n from a Uniform(0, 1) distribution,
then E{X(i)} = i / (n + 1).
Another way to think of this is that the
nboot data points
divide the number line into
nboot + 1 intervals, which as
far as we know contain equal probability. They don't contain equal
probability because the sample is not the population, but we might as well
treat them as such for the purposes of estimation. That is, the
nboot data points should be taken as estimators of the
quantiles with denominators
nboot + 1.
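One way this logic might be turned into code is sketched below. It is an illustration under stated assumptions, not necessarily the instructor's own code: the simulated data x, the median as estimator, and the seed are hypothetical, and the index arithmetic simply reads off the sorted bootstrap values whose ranks are (nboot + 1) α / 2 and (nboot + 1) (1 − α / 2).

```r
# Hypothetical data and estimator, chosen only for illustration
set.seed(42)
x <- rexp(30)

nboot <- 999                  # so that nboot + 1 is a round number
theta.star <- double(nboot)
for (i in 1:nboot) {
    theta.star[i] <- median(sample(x, replace = TRUE))
}

conf.level <- 0.95
alpha <- 1 - conf.level

# the k-th sorted value is taken as the k / (nboot + 1) quantile
theta.sorted <- sort(theta.star)
k.lo <- round((nboot + 1) * alpha / 2)        # rank 25 here
k.hi <- round((nboot + 1) * (1 - alpha / 2))  # rank 975 here
ci <- c(theta.sorted[k.lo], theta.sorted[k.hi])
ci
```

Because nboot + 1 is 1000, both ranks are whole numbers at the usual confidence levels, so the interval endpoints are actual bootstrap values and no interpolation between them is needed.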
In particular, if
nboot is 999, then we take the sorted
theta.star values to be the 0.001, 0.002, ...,
0.999 quantiles of the sampling distribution of