To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.
library(bootstrap) says we are going to use code in the bootstrap library, which is not available without this command. Here library(bootstrap) is necessary for two reasons. Without it we can't get the data mouse.c, and we also can't get the function boott that we use to construct bootstrap t intervals.
theta is a function that calculates the point estimate on which the interval is based. Here the point estimate is the sample mean, calculated by the function mean.
sdfun calculates an estimate of the standard error of the estimator calculated by theta. In this example the sample mean (calculated by the mean function) has standard error calculated by sdfun.
The form of sdfun is specified in the documentation for boott. It must have exactly this form, with three specified arguments and the dot, dot, dot argument.
x is the data. It is the argument we use in our definition of sdfun. We can also use some of the others, but they are not so useful.
nbootsd is a dummy argument that is not used.
theta is the function that we passed in via the theta argument of boott (in this example, mean). We generally do not need to use the theta argument though, because we know the function: we can say mean(x) instead of theta(x).
... indicates that other named arguments to the function are allowed. But we can also use global variables rather than arguments.
nboott = 1000 is because of the comment in the documentation for boott and the similar comment near the top of p. 161 in Efron and Tibshirani: 200 is a bare minimum and 1000 or more is needed for reliable α % confidence points, α > .95 say.
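As a rough sketch of how the pieces described above might fit together (the code behind the "Submit" button may differ in detail), the following defines an sdfun for the sample mean and passes it to boott. The formula sqrt(var(x) / length(x)) is the usual standard error of a sample mean. The requireNamespace guard is an addition of this sketch, so that it does nothing rather than fail when the bootstrap library is not installed.

```r
# Standard error of the sample mean, in the form boott expects.
# nbootsd and theta are accepted but not used here.
sdfun <- function(x, nbootsd, theta, ...) sqrt(var(x) / length(x))

if (requireNamespace("bootstrap", quietly = TRUE)) {
    library(bootstrap)   # provides the data mouse.c and the function boott
    out <- boott(mouse.c, theta = mean, sdfun = sdfun, nboott = 1000)
    print(out$confpoints)   # bootstrap t confidence points at the default levels
}
```

Because the resampling is random, the printed confidence points change slightly from run to run.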
This is exactly the same as the preceding section but we make the output be only the confidence interval we want.
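One way to restrict the output, sketched here under the assumption that a 95% interval is wanted, is to pass only the two matching levels through the perc argument of boott. The variable names level and perc are chosen for this illustration, and the requireNamespace guard is again only a convenience of the sketch.

```r
# Standard error of the sample mean, in the form boott expects.
sdfun <- function(x, nbootsd, theta, ...) sqrt(var(x) / length(x))

level <- 0.95                                  # assumed confidence level
perc <- c((1 - level) / 2, (1 + level) / 2)    # lower and upper tail points

if (requireNamespace("bootstrap", quietly = TRUE)) {
    library(bootstrap)
    out <- boott(mouse.c, theta = mean, sdfun = sdfun,
                 nboott = 1000, perc = perc)
    print(out$confpoints)   # only the two endpoints of the interval
}
```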
If we omit the sdfun argument, then the standardization is done another way: via the bootstrap. In each outer bootstrap iteration a sample x* with replacement from the data is formed, the function theta is applied to it to get theta*, then the standard error is estimated from x* by an inner bootstrap that forms samples x** with replacement from x* and applies theta to them to get a bootstrap standard error.
This function with sdfun omitted behaves as if we had supplied an sdfun of the following form
    sdfun <- function(x, nbootsd, theta, ...) {
        theta.star <- double(nbootsd)
        for (i in 1:nbootsd) {
            x.star <- sample(x, replace = TRUE)
            theta.star[i] <- theta(x.star)
        }
        return(sd(theta.star))
    }
Note that when this sdfun is called with the original data x as an argument it just calculates the bootstrap standard error, just like we did on our web page on that subject. But when this sdfun is called with bootstrap data x.star as its argument x, then every x.star inside the function is a random sample with replacement from another random sample with replacement. Hence the name double bootstrap.
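The two ways of calling this sdfun can be tried directly in base R. This is only an illustrative sketch: the data vector, the seed, and the names se.outer and se.inner are made up for the example and do not come from the bootstrap library.

```r
# The sdfun shown above: bootstrap standard error of theta from nbootsd resamples.
sdfun <- function(x, nbootsd, theta, ...) {
    theta.star <- double(nbootsd)
    for (i in 1:nbootsd) {
        x.star <- sample(x, replace = TRUE)
        theta.star[i] <- theta(x.star)
    }
    return(sd(theta.star))
}

set.seed(42)                                 # arbitrary seed, for reproducibility
x <- c(12, 35, 7, 51, 28, 19, 44, 63, 30)    # made-up data

# Called on the original data: an ordinary bootstrap standard error.
se.outer <- sdfun(x, 200, mean)

# Called on a bootstrap sample: the resamples inside sdfun are now
# resamples of a resample, which is the inner level of the double bootstrap.
x.star <- sample(x, replace = TRUE)
se.inner <- sdfun(x.star, 200, mean)

print(c(outer = se.outer, inner = se.inner,
        formula = sd(x) / sqrt(length(x))))
```

The first value should be in the same neighborhood as the usual formula for the standard error of a mean. The second differs because it is computed from x.star rather than x.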
For each of the nboott = 1000 iterations of the outer loop, there are nbootsd = 200 iterations of the inner loop, for 1000 * 200 = 2e5 iterations in all.
If you get bored, you can use the defaults (omit nboott and nbootsd), but you won't get as accurate an answer.
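A call of this kind might look like the following sketch, which simply leaves out sdfun so that boott falls back on its own inner bootstrap. The variable total is introduced here only to spell out the iteration count, and the requireNamespace guard is a convenience of the sketch.

```r
nboott <- 1000    # outer bootstrap samples
nbootsd <- 200    # inner bootstrap samples per outer sample

total <- nboott * nbootsd   # evaluations of theta in the inner loops
print(total)

if (requireNamespace("bootstrap", quietly = TRUE)) {
    library(bootstrap)
    # No sdfun supplied: the standard error is estimated by the inner bootstrap.
    out <- boott(mouse.c, theta = mean, nbootsd = nbootsd, nboott = nboott)
    print(out$confpoints)
}
```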
Another method of automagic variance stabilization estimates a variance stabilizing transformation using the double bootstrap.
sally here calculates the estimator. It uses a trick from the documentation for boott. It's a trick we have been using all along. For complicated data structures, sample the indices 1, ..., n rather than the data vectors. So we just supply 1:n as the argument to boott and write the estimation function (here sally) to take the (resampled) indices as an argument.
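A minimal sketch of the index trick, assuming paired data and a ratio-of-means estimator (both made up for this illustration, as the estimator used on this page is not reproduced here): sally takes a vector of indices and looks up the data vectors y and z as global variables. The call sets VS = TRUE, the boott argument that turns on variance stabilization, and the requireNamespace guard is again a convenience of the sketch.

```r
# Made-up paired data for illustration only.
y <- c(8.4, 10.1, 9.3, 12.7, 7.9, 11.2, 9.8, 10.6, 13.1, 8.8)
z <- c(4.1, 5.3, 4.4, 6.0, 3.9, 5.8, 4.7, 5.1, 6.4, 4.2)
n <- length(y)

# The estimator as a function of indices: a ratio of means over the
# selected cases. y and z are found as global variables.
sally <- function(i) mean(y[i]) / mean(z[i])

print(sally(1:n))   # estimate on the original data

if (requireNamespace("bootstrap", quietly = TRUE)) {
    library(bootstrap)
    # Resample the indices 1:n, not the data vectors themselves.
    out <- boott(1:n, sally, VS = TRUE, perc = c(0.025, 0.975))
    print(out$confpoints)
}
```

Resampling indices keeps each y value paired with its z value, which resampling y and z separately would not.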
This is exactly the same as the preceding section but we make the output be only the confidence interval we want, no plot showing the variance-stabilizing transformation.