General Instructions
To do each example, just click the Submit button. You do not have to type in any R instructions or specify a dataset. That's already done for you.
Bias Estimation
Section 10.3 in Efron and Tibshirani.
Comments
Everything is fairly straightforward here. The function ratio calculates the estimator we are investigating. The bootstrap is just like the other bootstraps we have done that use the k.star trick. The bootstrap estimate of bias is the mean of the theta.star minus theta.hat. This is the obvious analog in the bootstrap world of the actual bias, which is the mean of theta.hat minus the true unknown parameter value theta.
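The description above can be sketched as follows. This is a minimal illustration, not the code behind the Submit button: the data vectors y and z are simulated stand-ins, and the definition of ratio as a ratio of sample means is an assumption based on the later description of rratio.

```r
# Hypothetical data (assumption: not the data set used on this page)
set.seed(42)
z <- rexp(8, rate = 1 / 10)
y <- 0.1 * z + rnorm(8)

# Assumed form of the estimator: ratio of sample means
ratio <- function(y, z) mean(y) / mean(z)

n <- length(y)
nboot <- 1000
theta.hat <- ratio(y, z)

theta.star <- double(nboot)
for (i in 1:nboot) {
    # the k.star trick: resample indices, then index both vectors
    k.star <- sample(n, replace = TRUE)
    theta.star[i] <- ratio(y[k.star], z[k.star])
}

# bootstrap estimate of bias
bias.boot <- mean(theta.star) - theta.hat
bias.boot
```

Resampling indices rather than values keeps the pairs (y[i], z[i]) together in each resample.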
Improved Bias Estimation
Section 10.4 in Efron and Tibshirani.
Comments

The main comment is about the rather strange form of the functions rmean and rratio that calculate the ratio estimator. These functions use what Efron and Tibshirani call the resampling vector (pp. 130–132) and the resampling form (pp. 189–190) of the estimator.
The resampling vector is the vector of weights given to the original data points in a resample X_{1}*, . . ., X_{n}*. The weight p_{i}* given to the original data point X_{i} is the fraction of times X_{i} appears in the resample. This is calculated by the statement
p.star <- tabulate(k.star, n) / n
in the bootstrap loop, where k.star is the by now familiar resample of indices. The analogous vector for the original sample is calculated by the statement
p.hat <- rep(1 / n, n)
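A small worked example may help show what tabulate does here. The index vector k.star below is made up for illustration; in the bootstrap loop it would come from sample.

```r
n <- 5
k.star <- c(2, 2, 5, 1, 2)          # hypothetical resample of indices

# tabulate counts how often each of 1, ..., n occurs in k.star
p.star <- tabulate(k.star, n) / n
p.star                              # 0.2 0.6 0.0 0.0 0.2

# resampling vector of the original sample: equal weight on every point
p.hat <- rep(1 / n, n)
p.hat                               # 0.2 0.2 0.2 0.2 0.2
```

The second argument of tabulate matters: without it, indices larger than max(k.star) would be dropped and p.star could be shorter than n.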
We now have to write a function that calculates the estimator given the data y and z and the resampling vector p.star. Unfortunately, this is, in general, hard. Fortunately, this is, for moments, quite straightforward.
For any function g, any data vector x, and any probability vector p, the expression sum(g(x) * p) calculates the expectation of g(X) when X has the distribution that gives probability p[i] to the point x[i] for each i (and probability zero to everywhere else). Thus sum(x * p) calculates the mean, sum((x - a)^2 * p) calculates the second moment about the point a, and so forth.
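As a quick check of this idea, here is a small example with made-up values of x and p. With equal weights 1 / n the weighted sum reduces to the ordinary sample mean.

```r
x <- c(1, 2, 4, 7)                  # hypothetical data
p <- c(0.25, 0.25, 0, 0.5)          # hypothetical probability vector

# mean of the distribution putting probability p[i] on x[i]
m <- sum(x * p)
m                                   # 4.25

# second moment about the point a (with a equal to the mean: the variance)
a <- m
v <- sum((x - a)^2 * p)

# with equal weights this is the ordinary sample mean
p.hat <- rep(1 / length(x), length(x))
sum(x * p.hat) == mean(x)
```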
The function rmean(p, x) calculates the sample mean of the data vector x in resampling form. The stop commands for various error situations are, of course, not required. If the function call is done properly they don't do anything. But it will save you endless hours of head scratching sometime if you get in the habit of putting error checks in the functions you write.
The function rratio(p, x, y) calculates the ratio estimator for data x and y using the rmean function in the obvious fashion.
In the bootstrap loop the vector p.bar accumulates the sum of the p.star vectors. After the bootstrap loop terminates, it is divided by nboot to give the average of the p.star vectors.
Ideally, if nboot were infinity, p.bar would be the same as p.hat. Since nboot is considerably less than infinity, p.bar is different from p.hat.
Since mean(theta.star) is based on the resamples that yielded the p.bar vector, it makes sense to subtract off rratio(p.bar, y, z) rather than theta.hat to estimate bias. The logic is that the Monte Carlo errors in mean(theta.star) and rratio(p.bar, y, z) tend to be in the same direction and cancel to some degree, giving an improved estimator.
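Putting the pieces together, the whole procedure could be sketched as below. The data are simulated stand-ins and the functions are stripped-down versions without error checks, so this is an illustration under those assumptions rather than the code run by the example.

```r
# Hypothetical data (assumption: not the data set used on this page)
set.seed(42)
z <- rexp(8, rate = 1 / 10)
y <- 0.1 * z + rnorm(8)

# Resampling-form estimators (error checks omitted for brevity)
rmean <- function(p, x) sum(x * p)
rratio <- function(p, x, y) rmean(p, x) / rmean(p, y)

n <- length(y)
nboot <- 1000
p.hat <- rep(1 / n, n)
theta.hat <- rratio(p.hat, y, z)

theta.star <- double(nboot)
p.bar <- rep(0, n)
for (i in 1:nboot) {
    k.star <- sample(n, replace = TRUE)
    p.star <- tabulate(k.star, n) / n
    theta.star[i] <- rratio(p.star, y, z)
    p.bar <- p.bar + p.star          # accumulate the sum of the p.star
}
p.bar <- p.bar / nboot               # average of the p.star vectors

# ordinary and improved bootstrap estimates of bias
bias.ordinary <- mean(theta.star) - theta.hat
bias.improved <- mean(theta.star) - rratio(p.bar, y, z)
c(ordinary = bias.ordinary, improved = bias.improved)
```

The only difference between the two estimates is what gets subtracted from mean(theta.star): the estimator evaluated at p.hat or at p.bar.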
This method of expressing estimators in resampling form is an important bootstrap technique, which will be used again for the ABC "better" bootstrap confidence interval technique.
More on Resampling Form Estimators
Comments
The function rmedian calculates the median of a bootstrap sample given in resampling form.
The code is a bit tricky. The statement
n.star <- round(p * n)
converts p back to counts, the round function being there to make sure the result is exactly integer-valued (not just close). Then the statement
k.star <- rep(1:n, n.star)
converts these back to the index values that were counted: each element of the sequence 1:n is repeated as many times as the corresponding count in n.star. The resulting k.star inside the function definition is just like the k.star outside the function definition except for order, which doesn't matter. Then we can use k.star to make x.star in the usual way, and apply the function that computes the estimator to x.star in the usual way.
We try it out, and indeed do get the same answers either way.
Clearly, this function has nothing particular to do with medians. Changing the last line lets it calculate any other function.
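A function along these lines could be written as in the following sketch. The two key statements are the ones quoted above; the error check, the data vector x, and the comparison at the end are illustrative assumptions.

```r
# Median of a bootstrap sample given in resampling form
rmedian <- function(p, x) {
    n <- length(x)
    if (length(p) != n)
        stop("p and x must have the same length")
    n.star <- round(p * n)           # weights back to exact integer counts
    k.star <- rep(1:n, n.star)       # counts back to (sorted) indices
    x.star <- x[k.star]
    median(x.star)                   # change this line for other estimators
}

# Try it out on hypothetical data: both routes give the same answer
set.seed(42)
x <- rnorm(9)
n <- length(x)
k.star <- sample(n, replace = TRUE)
p.star <- tabulate(k.star, n) / n

direct <- median(x[k.star])
resampling.form <- rmedian(p.star, x)
all.equal(direct, resampling.form)
```

Replacing median(x.star) in the last line of the function with, say, mean(x.star) or var(x.star) gives the resampling form of that estimator instead.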