To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.
The subject of this web page is the subsampling bootstrap, which is the subject of a book by Politis, Romano, and Wolf.
It is also the subject of a more detailed web page, which we will get to in a few weeks.
The subsampling bootstrap samples without replacement at a subsample size b that is smaller than the original sample size n. Because the sampling is without replacement, each subsample is itself a sample of size b from the true unknown population distribution.
In order to use the subsampling bootstrap we must know the rate of convergence of the estimator we are using. We assume that if t_{n} is the estimator, θ is the parameter, and n is the sample size, then
n^{r} (t_{n} − θ)
converges in distribution, for some known rate r, to a limiting distribution that we do not need to know. We estimate this distribution by the distribution of
b^{r} (t_{b}* − t_{n})
where b is the subsample size and t_{b}* is the subsampling bootstrap estimator.
Often r is 1 ⁄ 2 (the square root law obeyed by most widely used estimators). Sometimes, as in the extreme values example below, it is not.
There are two ways to do subsampling.
One is essential for stationary time series and is demonstrated in the time series example below. In this method, the subsamples are all blocks of length b in the time series. There are not many such blocks (n − b + 1), but it is necessary to keep the blocks together to keep the dependence in the time series (at least the dependence that is present in blocks of length b).
The other method applies only to IID data and is demonstrated in the extreme values example below. In this method, the subsamples are samples without replacement of size b from the original sample. This allows many more subsamples than the other method and hence a more accurate bootstrap.
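The two ways of forming subsamples can be sketched as follows. This is only an illustration: the toy data, the subsample size b = 5, and the names blocks, subsamples, and nboot are assumptions made here, not part of the examples on this page.

```r
# Hypothetical toy data; any numeric vector would do.
set.seed(42)
x <- rnorm(20)
n <- length(x)
b <- 5

# Method 1 (time series): all n - b + 1 contiguous blocks of length b.
# Column i of `blocks` is the block x[i], ..., x[i + b - 1].
starts <- seq_len(n - b + 1)
blocks <- sapply(starts, function(i) x[i:(i + b - 1)])

# Method 2 (IID data): random subsamples of size b without replacement.
nboot <- 1000
subsamples <- replicate(nboot, sample(x, b, replace = FALSE))

dim(blocks)      # b rows, n - b + 1 columns
dim(subsamples)  # b rows, nboot columns
```

With n = 20 and b = 5 there are only 16 blocks, whereas the number of random subsamples is limited only by nboot.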
The command library(bootstrap) says we are going to use code in the bootstrap library, which is not available without this command.
The lutenhorm data is explained by its on-line help.
Inspection of the lutenhorm dataset shows that column 4 is the data described in Table 8.1 in Efron and Tibshirani.
The function acf (see its on-line help) calculates the so-called autocorrelation function of the time series. The height of the bar at lag k is the estimated correlation of X_{n} and X_{n + k}, assuming the time series is stationary (so this correlation does not depend on n, only on k). The correlation at lag zero is one by definition (any random variable is perfectly correlated with itself).
The blue dashed lines in the autocorrelation plot are 95% non-simultaneous large sample approximate critical values for testing whether the autocorrelations are non-zero. Autocorrelations that go outside the blue dashed lines are statistically significant. Here only the lag 1 autocorrelation is significant.
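The check against the blue dashed lines can also be done numerically. The sketch below uses a simulated white-noise series (an assumption, not the lutenhorm data) and assumes the default acf plot draws its lines at ± qnorm(0.975) / sqrt(n), which is my understanding of R's default behavior.

```r
set.seed(1)
x <- rnorm(100)                 # hypothetical white-noise series
n <- length(x)

out <- acf(x, plot = FALSE)     # autocorrelations without drawing the plot
r <- drop(out$acf)              # estimated autocorrelations, lag 0 first
lags <- drop(out$lag)

# Large-sample 95% critical value under a white-noise null hypothesis
crit <- qnorm(0.975) / sqrt(n)

# Lags (excluding lag 0) whose autocorrelation falls outside +/- crit
significant <- lags[-1][abs(r[-1]) > crit]
```

Because the critical values are non-simultaneous, roughly one lag in twenty is expected to land outside the lines even when the series is pure noise.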
The function foo calculates the estimator of the autoregressive coefficient described by Efron and Tibshirani.
The vector z is the data supplied to the function with the mean subtracted off. The number m is the length of the data. The vector z[-1] is all the elements of z except the first, and the vector z[-m] is all the elements of z except the last. Thus the statement
out <- lm(z[-1] ~ z[-m] + 0)
regresses z_{t} on z_{t - 1} with no intercept (the + 0 means no intercept).
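For readers who cannot see the code in the form, a function matching this description might look like the sketch below. The body is reconstructed from the description above and may differ in detail from the code actually used on this page; the simulated AR(1) series is an assumption used only to exercise the function.

```r
# Sketch of the estimator described above. The name `foo` follows the text;
# the body is a reconstruction, not a copy of the page's code.
foo <- function(x) {
    z <- x - mean(x)                  # data with the mean subtracted off
    m <- length(z)                    # length of the data
    out <- lm(z[-1] ~ z[-m] + 0)      # regress z_t on z_{t-1}, no intercept
    as.numeric(coefficients(out))     # the single regression coefficient
}

# Hypothetical check on a simulated AR(1) series (not the lutenhorm data)
set.seed(7)
x <- as.numeric(arima.sim(model = list(ar = 0.6), n = 500))
beta.hat <- foo(x)
```

Since there is no intercept, the fitted coefficient equals sum(z[-1] * z[-m]) / sum(z[-m]^2), which gives a quick way to verify the function.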
For a time series of length n and blocks of length b there are exactly n - b + 1 such blocks. Generally, we use them all. No need for random samples.
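A loop over all the blocks could be sketched like this. The series is simulated and the block length b = 25 is an arbitrary choice made for illustration; foo is redefined here (as a reconstruction from its description) so the block runs on its own.

```r
# Reconstruction of the estimator described in the text (an assumption).
foo <- function(x) {
    z <- x - mean(x)
    m <- length(z)
    as.numeric(coefficients(lm(z[-1] ~ z[-m] + 0)))
}

set.seed(3)
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))  # simulated series
n <- length(x)
b <- 25                      # hypothetical block length
nblock <- n - b + 1          # number of blocks of length b

beta.star <- double(nblock)
for (i in 1:nblock) {
    x.star <- x[seq(i, i + b - 1)]   # block starting at position i
    beta.star[i] <- foo(x.star)      # estimator evaluated on that block
}
```

No call to sample appears anywhere: the subsamples are determined entirely by the data and the choice of b.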
beta.star shows. However, this does not matter, so long as b is long enough so the samples are representative of the behavior of the whole series.
As usual, Efron and Tibshirani are using a ridiculously small sample size in this toy problem. There is no reason to believe the subsampling bootstrap here. But it is reasonable for (much) larger data sets.
beta.star shows that the simple method of estimation being used here is badly biased. That's why this method is not recommended by time series books. We only use it here because it is easy to explain.
The factor sqrt(b / n) in the last line adjusts for the relative sample sizes of the subsample and the whole series. Note that the sqrt here is only valid for estimators obeying the square root law. If the rate is not root n, then a different function of b / n is needed, as in the following example.
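The adjustment for a general rate can be written as a one-line helper. The function name rescale and its arguments are hypothetical, introduced only to show how the exponent r enters; this is a sketch of the idea, not code from this page.

```r
# Hypothetical helper: rescale a measure of spread computed at subsample
# size b to the full sample size n, for an estimator with rate n^r.
rescale <- function(spread.b, b, n, r = 1 / 2) spread.b * (b / n)^r

rescale(0.3, b = 25, n = 100)          # square root law: 0.3 * sqrt(1/4) = 0.15
rescale(0.3, b = 25, n = 100, r = 1)   # rate n: 0.3 * (1/4) = 0.075
```

With r = 1 / 2 the factor reduces to sqrt(b / n); with r = 1 it is b / n, so the same subsample spread translates into a much smaller spread at the full sample size.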
The vector theta.star stores max(x) for samples from the subsampling bootstrap.
The vector theta.bogo stores max(x) for samples from the ordinary (Efron) bootstrap.
The sample statement is quite different for the regular (Efron) bootstrap and the (Politis and Romano) subsampling bootstrap.
For the Efron bootstrap, we sample with replacement at the original sample size with something like
x.star <- sample(x, replace = TRUE)
The subsampling bootstrap samples without replacement at the much smaller sample size b with something like
x.star <- sample(x, b, replace = FALSE)
Both the size and the replace arguments of sample differ. (For the Efron bootstrap the size argument is missing, so the default length(x) is used.)
The values
z.star <- b * (theta.hat - theta.star)
which are supposed to have an Exp(1 / θ) distribution according to the theory, are plotted against the appropriate quantiles of this distribution. If the points lie near the line y = x, then z.star does indeed have the claimed distribution.
We emphasize that we don't need to know the asymptotic distribution to use the bootstrap samples z.star to construct a confidence interval for θ. We can't do it yet because we haven't covered Chapters 12, 13, and 14 in Efron and Tibshirani. When we've done them, we can return to this example and finish it.
theta.bogo samples on the Q-Q plot, so it can be clearly seen they do the Wrong Thing (with a capital W and a capital T).
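The comparison can be sketched on simulated data. Everything specific here is an assumption: the data are Uniform(0, θ) with θ = 1 rather than the dataset used on this page, the values of n, b, and nboot are arbitrary, z.bogo is rescaled by n because Efron resamples have size n, and theta.hat is plugged in for the unknown θ in the exponential rate.

```r
set.seed(11)
theta <- 1                      # hypothetical true parameter
n <- 1000
b <- 50
nboot <- 2000
x <- runif(n, 0, theta)         # simulated data, not the page's dataset
theta.hat <- max(x)

theta.star <- double(nboot)     # subsampling bootstrap
theta.bogo <- double(nboot)     # ordinary (Efron) bootstrap
for (i in 1:nboot) {
    theta.star[i] <- max(sample(x, b, replace = FALSE))
    theta.bogo[i] <- max(sample(x, replace = TRUE))
}

z.star <- b * (theta.hat - theta.star)
z.bogo <- n * (theta.hat - theta.bogo)

# Theoretical quantiles of the exponential distribution with rate 1 / theta,
# using theta.hat in place of the unknown theta
q <- qexp(ppoints(nboot), rate = 1 / theta.hat)

# To draw the Q-Q plot:
# plot(q, sort(z.star)); abline(0, 1)
# points(q, sort(z.bogo), col = "red")

mean(z.bogo == 0)   # fraction of Efron resamples whose max equals max(x)
```

A large fraction of the Efron resamples contain the sample maximum, so z.bogo has a big atom at zero, which no exponential distribution has. The subsampling values z.star hit zero only when the subsample happens to include the maximum, which occurs with probability b / n.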