University of Minnesota, Twin Cities School of Statistics Stat 5601 Rweb
The statement library(bootstrap) says we are going to use code in the bootstrap library, which is not available without this command.
The vector z[-1] is all the elements of z except the first, and the vector z[-n] is all the elements of z except the last.
Thus the statement
out <- lm(z[-1] ~ z[-n] + 0)
regresses z_{t} on z_{t - 1} with no intercept (the + 0 means no intercept).
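The indexing trick above can be checked on a small example. This is only a sketch: the simulated AR(1) series, the seed, and the names beta.hat and beta.check are assumptions made for illustration and are not part of the course example.

```r
# Hypothetical AR(1) series standing in for the data of the example.
set.seed(42)
n <- 200
z <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))

# z[-1] drops the first element and z[-n] drops the last, so position t
# of z[-1] holds z_{t+1} while position t of z[-n] holds z_t.
out <- lm(z[-1] ~ z[-n] + 0)    # "+ 0" removes the intercept
beta.hat <- unname(coef(out))

# With no intercept, least squares has a closed form that should agree
# with lm: sum(z_t * z_{t-1}) / sum(z_{t-1}^2).
beta.check <- sum(z[-1] * z[-n]) / sum(z[-n]^2)
c(lm = beta.hat, closed.form = beta.check)
```

Both vectors have length n - 1, which is why they can be paired in the regression.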
For a series of length n and blocks of length blen there are exactly n - blen + 1 such blocks. Generally, we use them all.
No need for random samples.
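A minimal sketch of looping over every block, assuming the statistic of interest is the no-intercept lag regression coefficient. The simulated series, the values of n and blen, and the variable nblock are hypothetical choices for illustration.

```r
# Hypothetical short series; n and blen are illustrative values.
set.seed(42)
n <- 48
blen <- 12
z <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))

nblock <- n - blen + 1          # number of contiguous blocks of length blen
beta.star <- double(nblock)
for (i in 1:nblock) {
    z.star <- z[seq(i, i + blen - 1)]    # block starting at position i
    # same lag regression as for the whole series, on the block only
    beta.star[i] <- coef(lm(z.star[-1] ~ z.star[-blen] + 0))
}
length(beta.star)   # 37, one estimate per block
```

The last block starts at position nblock and ends at position n, so every contiguous block is used exactly once.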
beta.star shows. However, this does not matter, so long as blen is long enough that the subsamples are representative of the behavior of the whole series.
As usual, Efron and Tibshirani are using a ridiculously small sample size in this toy problem. There is no reason to believe the subsampling bootstrap here. But it is reasonable for (much) larger data sets.
The factor sqrt(blen / n) in the last line adjusts for the relative sample sizes of the subsample and the whole series. Note that the sqrt here is only valid for estimators obeying the square root law. If the rate is not root n, then a different function of blen / n is needed, as in the following example.
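The adjustment can be written as a small helper. This is a sketch, not code from the course page: the function rescale and its argument rate.power are hypothetical names. The reasoning is that if the spread of an estimator shrinks like (sample size)^(-a), then a spread measured at size blen is converted to size n by multiplying by (blen / n)^a, with a = 1/2 under the square root law.

```r
# Hypothetical helper: convert a spread measured on subsamples of size
# blen to the scale of the full series of length n.
# rate.power = 1/2 is the square root law; rate.power = 1 is rate n.
rescale <- function(spread.sub, blen, n, rate.power = 1 / 2) {
    spread.sub * (blen / n)^rate.power
}

rescale(0.30, blen = 12, n = 48)                  # 0.30 * sqrt(1/4) = 0.15
rescale(0.30, blen = 12, n = 48, rate.power = 1)  # 0.30 * (1/4) = 0.075
```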
Suppose X_{1}, X_{2}, . . ., X_{n} are independent and identically distributed Uniform(0, θ) random variables. Since the larger the sample the more the largest values crowd up against θ, the natural estimator of θ is the maximum data value X_{(n)}. This is in fact the maximum likelihood estimate.
The main statistical interest in this estimator is that it is a counterexample to both the square root law and the usual asymptotics of maximum likelihood.
The rate is n rather than root n.
More precisely (this was proved for homework in my theory class, Problem 10-4 in my lecture notes), n (θ − X_{(n)}) converges in distribution to the Exp(1 / θ) distribution.
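The limit can be checked by simulation. The sketch below reads Exp(1 / θ) as the exponential distribution with rate 1 / θ, that is, mean θ; the values of theta, n, and nsim and the variable names are arbitrary choices for illustration.

```r
# Hypothetical settings for a simulation check of the limit theorem.
set.seed(42)
theta <- 2
n <- 1000
nsim <- 5000

# nsim independent copies of n * (theta - X_(n))
w <- replicate(nsim, n * (theta - max(runif(n, 0, theta))))

mean(w)    # should be near theta, the mean of Exp(rate = 1 / theta)

# compare a few empirical quantiles with those of the limit
p <- c(0.25, 0.5, 0.75)
rbind(empirical = quantile(w, p), limit = qexp(p, rate = 1 / theta))
```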
But to use the subsampling bootstrap, we need only know that the square root law fails and the actual rate is n.
xmax.star stores max(x) for samples from the subsampling bootstrap.
xmax.bogo stores max(x) for samples from the ordinary (Efron) bootstrap.
The sample statement is quite different for the regular (Efron) bootstrap and the (Politis and Romano) subsampling bootstrap.
For the Efron bootstrap, we sample with replacement at the original sample size with something like
x.star <- sample(x, replace = TRUE)
The subsampling bootstrap samples without replacement at the much smaller sample size nsub with something like
x.star <- sample(x, nsub, replace = FALSE)
Both the size and the replace arguments of sample differ.
(For the Efron bootstrap the size argument is missing so the default length(x) is used.)
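The two calls can be compared side by side. In this sketch the data, the seed, the values of n and nsub, and the name x.sub are assumptions for illustration.

```r
# Hypothetical data; n and nsub are illustrative values.
set.seed(42)
n <- 100
nsub <- 10
x <- runif(n)

x.star <- sample(x, replace = TRUE)         # Efron: size defaults to length(x)
x.sub <- sample(x, nsub, replace = FALSE)   # subsampling: smaller size, no repeats

length(x.star)           # 100
length(x.sub)            # 10
anyDuplicated(x.sub)     # 0, since sampling is without replacement
length(unique(x.star))   # fewer than n, since some values repeat
```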
The values of z.star <- nsub * (xmax - xmax.star), which is supposed to have an Exp(1 / θ) distribution according to the theory, are plotted against the appropriate quantiles of this distribution. If the points lie near the line y = x, then z.star does indeed have the claimed distribution.
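A Q-Q comparison of this kind might be sketched as follows. The simulated data, the values of theta, n, nsub, and nboot, and the name q.theory are assumptions; theta is known here only because the data are simulated.

```r
# Hypothetical simulated data for the Q-Q check.
set.seed(42)
theta <- 1
n <- 10000
nsub <- 100
nboot <- 1000
x <- runif(n, 0, theta)
xmax <- max(x)

xmax.star <- double(nboot)
for (i in 1:nboot) {
    xmax.star[i] <- max(sample(x, nsub, replace = FALSE))
}
z.star <- nsub * (xmax - xmax.star)

# theoretical quantiles of the exponential with rate 1 / theta
q.theory <- qexp(ppoints(nboot), rate = 1 / theta)

# draw the plot only in an interactive session
if (interactive()) {
    plot(q.theory, sort(z.star),
         xlab = "theoretical quantiles", ylab = "sorted z.star")
    abline(0, 1)    # the line y = x
}
```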
We emphasize that we don't need to know the asymptotic distribution to use the bootstrap samples z.star to construct a confidence interval for θ. We can't do it yet because we haven't covered Chapters 12, 13, and 14 in Efron and Tibshirani. When we've done them, we can return to this example and finish it.
xmax.bogo samples on the Q-Q plot, so it can be clearly seen they do the Wrong Thing (with a capital W and a capital T).
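One way to see the failure numerically, without a plot: an Efron resample contains the original maximum with probability 1 - (1 - 1 / n)^n, about 0.63, so a large share of the rescaled values are exactly zero, which no exponential distribution allows. The sketch below assumes simulated data and rescales by n; the name z.bogo and that scaling are assumptions, not taken from the course code.

```r
# Hypothetical simulated data for the ordinary (Efron) bootstrap.
set.seed(42)
theta <- 1
n <- 100
nboot <- 1000
x <- runif(n, 0, theta)
xmax <- max(x)

xmax.bogo <- double(nboot)
for (i in 1:nboot) {
    xmax.bogo[i] <- max(sample(x, replace = TRUE))
}
z.bogo <- n * (xmax - xmax.bogo)   # assumed scaling, for illustration only

mean(z.bogo == 0)     # fraction of resamples whose maximum is xmax itself
1 - (1 - 1 / n)^n     # probability that a resample contains xmax
```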