
Handout for Stat 5601
Subsampling Bootstrap Confidence Intervals

The fundamental idea of the subsampling bootstrap is that

$$\tau_n(\hat\theta_n - \theta) \tag{1}$$

converges in distribution to some distribution (any distribution!). Trivially,

$$\tau_b(\hat\theta_b - \theta) \tag{2}$$

converges to the same distribution, since whether we index by n or b is merely a matter of notation. Usually, we write (2) as

$$\tau_b(\theta_b^* - \theta) \tag{3}$$

to distinguish the estimator $\hat\theta_n$ for the full data and the estimator $\theta_b^*$ for a subsample. The basic assumptions of the subsampling bootstrap are

$$b \to \infty, \qquad \frac{b}{n} \to 0, \qquad \tau_b \to \infty, \qquad \frac{\tau_b}{\tau_n} \to 0 \tag{4}$$

where n is the sample size and b the subsample size. Under these assumptions

$$\tau_b(\hat\theta_n - \theta) \tag{5}$$

converges in probability to zero, just because we would need to multiply by $\tau_n$ rather than $\tau_b$ to get a nonzero limit and $\tau_b / \tau_n$ goes to zero (those who had the theory class may recognize that this was a homework problem). Subtracting (5) from (3) gives

$$\tau_b(\theta_b^* - \hat\theta_n) \tag{6}$$

which has the same limit as (3) or (1) (those who had the theory course may recognize that this is because of Slutsky’s theorem). To summarize where we have gotten to, the subsampling bootstrap is based on the assumptions (4) and the assumption that (1) converges in distribution to something. In that case, it follows from asymptotic theory that (6) converges to the same limiting distribution as does (1).
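The two steps of this argument can be written out explicitly. The following is only a sketch, writing $Y$ for a random variable having the limiting distribution of (1). First, by (4) and Slutsky's theorem,

$$\tau_b(\hat\theta_n - \theta) = \frac{\tau_b}{\tau_n} \cdot \tau_n(\hat\theta_n - \theta) \xrightarrow{P} 0,$$

because the first factor goes to zero and the second converges in distribution to $Y$. Second, again by Slutsky's theorem,

$$\tau_b(\theta_b^* - \hat\theta_n) = \tau_b(\theta_b^* - \theta) - \tau_b(\hat\theta_n - \theta) \xrightarrow{D} Y.$$

As an illustration of (4), if we assume the rate $\tau_n = \sqrt{n}$ (typical of asymptotically normal estimators, but an assumption here) and take $b = \lfloor \sqrt{n} \rfloor$, then $b \to \infty$, $b / n \to 0$, $\tau_b = \sqrt{b} \to \infty$, and $\tau_b / \tau_n = \sqrt{b / n} \to 0$.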

It does not matter what the limiting distribution is because we will approximate it using the subsampling bootstrap. Suppose the limiting distribution has distribution function $F$. We don’t know the functional form of $F$, but we can approximate it by the empirical distribution function $F_b^*$ of the values of (6) computed from the bootstrap (sub)samples.
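To make the approximation concrete, here is a minimal Python sketch of how the values of (6) might be simulated. Everything specific in it is an assumption made for illustration and is not part of the handout: the function name `subsample_roots`, the choice of the sample median as the estimator, the rate $\tau_n = \sqrt{n}$, the subsample size $b = 20$, and drawing subsamples without replacement.

```python
import numpy as np

def subsample_roots(x, b, estimator, rate, n_sub=1000, seed=0):
    """Simulate n_sub values of rate(b) * (theta_b_star - theta_hat_n), i.e. (6).

    estimator and rate are supplied by the caller (hypothetical interface).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    theta_hat = estimator(x)              # estimate from the full data
    roots = np.empty(n_sub)
    for i in range(n_sub):
        # Assumption: subsamples of size b are drawn without replacement.
        sub = rng.choice(x, size=b, replace=False)
        roots[i] = rate(b) * (estimator(sub) - theta_hat)
    return roots

# Illustrative data: 400 standard normal observations.
rng = np.random.default_rng(42)
x = rng.standard_normal(400)
roots = subsample_roots(x, b=20, estimator=np.median, rate=np.sqrt)

def ecdf(t):
    """Empirical distribution function of the simulated values of (6)."""
    return np.mean(roots <= t)
```

The function `ecdf` plays the role of $F_b^*$: it is the fraction of simulated values of (6) that are at most `t`.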

We know that for large n

$$F^{-1}(\alpha/2) < \tau_n(\hat\theta_n - \theta) < F^{-1}(1 - \alpha/2) \tag{7}$$

occurs with probability approximately $1 - \alpha$. That’s what convergence in distribution of (1) to the distribution with distribution function $F$ means. $F^{-1}(\alpha/2)$ is the $\alpha/2$ quantile of this distribution and $F^{-1}(1 - \alpha/2)$ is the $1 - \alpha/2$ quantile. Thus if $Y$ is a random variable having this distribution and the distribution is continuous, the probability that

$$F^{-1}(\alpha/2) < Y < F^{-1}(1 - \alpha/2) \tag{8}$$

is $1 - \alpha$. Since we are assuming $Y$ and $\tau_n(\hat\theta_n - \theta)$ have approximately the same distribution for large $n$, (7) has approximately the same probability as (8). Of course, we don’t know $F$, but $F_b^*$ converges to $F$, so for large $b$ and $n$, we have

$$F_b^{*-1}(\alpha/2) < \tau_n(\hat\theta_n - \theta) < F_b^{*-1}(1 - \alpha/2) \tag{9}$$

with probability $1 - \alpha$. Rearranging (9) to put $\theta$ in the middle by itself gives

$$\hat\theta_n - \tau_n^{-1} F_b^{*-1}(1 - \alpha/2) < \theta < \hat\theta_n - \tau_n^{-1} F_b^{*-1}(\alpha/2) \tag{10}$$

which is the way subsampling bootstrap confidence intervals are done.
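The interval (10) might be computed along the following lines. This is a sketch under stated assumptions, not the course's own code: the name `subsampling_ci`, the sample median as estimator, the rate $\tau_n = \sqrt{n}$, the subsample size $b = 20$, subsampling without replacement, and the use of `np.quantile` for $F_b^{*-1}$ are all choices made here for illustration.

```python
import numpy as np

def subsampling_ci(x, b, estimator, rate, alpha=0.05, n_sub=2000, seed=0):
    """Subsampling bootstrap interval following (10) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    theta_hat = estimator(x)              # estimate from the full data
    # Simulated values of (6): rate(b) * (theta_b_star - theta_hat_n).
    # Assumption: subsamples are drawn without replacement.
    roots = np.array([
        rate(b) * (estimator(rng.choice(x, size=b, replace=False)) - theta_hat)
        for _ in range(n_sub)
    ])
    # Empirical alpha/2 and 1 - alpha/2 quantiles of the simulated values.
    q_lo, q_hi = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    # Equation (10): the upper quantile gives the lower endpoint and vice
    # versa, and the rescaling uses rate(n), not rate(b).
    lower = theta_hat - q_hi / rate(n)
    upper = theta_hat - q_lo / rate(n)
    return lower, upper

# Illustrative data: 400 standard normal observations, true median 0.
rng = np.random.default_rng(42)
x = rng.standard_normal(400)
lower, upper = subsampling_ci(x, b=20, estimator=np.median, rate=np.sqrt)
print(lower, upper)
```

Note the two features of (10) that are easy to get wrong: the quantiles swap ends, and the quantiles of a distribution simulated at rate $\tau_b$ are divided by $\tau_n$.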