Statistics 5601 (Geyer, Spring 2006) Examples: Subsampling

Overview

This web page covers the subsampling bootstrap, which is the subject of a book by Politis, Romano, and Wolf.

It is also the subject of a more detailed web page, which we will get to in a few weeks.

Basic Idea

The subsampling bootstrap samples without replacement at a subsample size b that is smaller than the original sample size n. The sampling without replacement has the consequence that the samples are from the true unknown population distribution.

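The ordinary bootstrap samples from the wrong distribution (the empirical distribution) at the right sample size n. The subsampling bootstrap samples from the right distribution (the true population distribution) at the wrong sample size b. Neither does both things right. Each has its virtues. For the subsampling bootstrap we need b large but small compared to n, so n must be very large.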
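The difference between the two resampling schemes can be sketched in a few lines of Python. This is only an illustration, not code from the course: the simulated data and the names `x`, `n`, and `b` are assumptions made for the example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical data: n draws from some population (exponential here).
n = 1000
x = [rng.expovariate(1.0) for _ in range(n)]

# Ordinary bootstrap: n indices drawn WITH replacement.
boot_idx = rng.choices(range(n), k=n)
boot = [x[i] for i in boot_idx]

# Subsampling bootstrap: b < n indices drawn WITHOUT replacement.
b = 50
sub_idx = rng.sample(range(n), b)
sub = [x[i] for i in sub_idx]

# The bootstrap sample reuses data points; the subsample never does,
# so it is itself a sample of size b from the population.
print(len(set(boot_idx)), "distinct points in the bootstrap sample")
print(len(set(sub_idx)), "distinct points in the subsample")
```

Because no data point is used twice, the subsample has the same distribution as a sample of size b drawn directly from the population, which is the property the text relies on.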

Rate of Convergence

In order to use the subsampling bootstrap we must know the rate of convergence of the estimator we are using. We assume that if $t_n$ is the estimator, $\theta$ is the parameter, and $n$ is the sample size, then

$$n^r (t_n - \theta)$$

converges in distribution (to some, not necessarily normal, distribution).

We estimate this distribution by the distribution of

$$b^r (t_b^* - t_n)$$

where $b$ is the subsample size and $t_b^*$ is the subsampling bootstrap estimator.

Often r is 1 ⁄ 2 (the square root law obeyed by most widely used estimators). Sometimes, as in the extreme values example below, it is not.
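One common way to turn this approximation into a confidence interval is to take quantiles q_lo and q_hi of the subsampling distribution and invert q_lo ≤ n^r (t_n − θ) ≤ q_hi, which gives t_n − q_hi / n^r ≤ θ ≤ t_n − q_lo / n^r. The sketch below assumes IID data and a known rate r. The function name `subsample_ci`, its arguments, and the simple empirical-quantile rule are illustrative choices, not taken from the course materials.

```python
import math
import random
import statistics

def subsample_ci(x, estimator, b, r, nsub=2000, alpha=0.05, rng=random):
    """Subsampling confidence interval for IID data (illustrative sketch).

    x is a list, estimator maps a list to a number, b is the subsample
    size, and r is the assumed rate of convergence.
    """
    n = len(x)
    t_n = estimator(x)
    # Approximate the law of n^r (t_n - theta) by that of b^r (t_b* - t_n).
    z = sorted(
        b ** r * (estimator(rng.sample(x, b)) - t_n) for _ in range(nsub)
    )
    # Simple empirical quantiles of the subsampling distribution.
    q_lo = z[math.floor((alpha / 2) * (nsub - 1))]
    q_hi = z[math.ceil((1 - alpha / 2) * (nsub - 1))]
    # Invert q_lo <= n^r (t_n - theta) <= q_hi to get bounds on theta.
    return t_n - q_hi / n ** r, t_n - q_lo / n ** r

# Usage on simulated data: the sample mean obeys the square root law, r = 1/2.
rng = random.Random(1)
x = [rng.gauss(10.0, 2.0) for _ in range(5000)]
lower, upper = subsample_ci(x, statistics.fmean, b=100, r=0.5, rng=rng)
print(lower, upper)
```

Using the wrong r rescales the interval incorrectly, which is why the rate must be known in advance.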

Stationary Process or IID Sampling

There are two ways to do subsampling.

One is essential for stationary time series and is demonstrated in the time series example below. In this method, the subsamples are all the blocks of b consecutive observations in the time series. There are not many such blocks (only n − b + 1), but the blocks must be kept intact to preserve the dependence in the time series (at least the dependence that is present within blocks of length b).
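The block method can be sketched as follows. The AR(1) series, its coefficient, the helper name `blocks`, and the choice r = 1/2 for the sample mean are all assumptions made for this illustration. The series starts at zero, so it is only approximately stationary.

```python
import random
import statistics

def blocks(series, b):
    """All contiguous blocks of length b; there are n - b + 1 of them."""
    n = len(series)
    return [series[i:i + b] for i in range(n - b + 1)]

# Hypothetical series: AR(1) with coefficient 0.5 and standard normal noise.
rng = random.Random(7)
n, b, rho = 2000, 50, 0.5
y = [0.0]
for _ in range(n - 1):
    y.append(rho * y[-1] + rng.gauss(0.0, 1.0))

t_n = statistics.fmean(y)
blks = blocks(y, b)
print(len(blks))  # n - b + 1 = 1951

# Subsampling distribution of sqrt(b) * (t_b* - t_n), assuming r = 1/2.
z = [b ** 0.5 * (statistics.fmean(blk) - t_n) for blk in blks]

# Its spread, divided by sqrt(n), estimates the standard error of the
# series mean while respecting dependence within blocks of length b.
se = statistics.pstdev(z) / n ** 0.5
print(se)
```

Shuffling the observations before forming blocks would destroy the serial dependence and typically understate the standard error for a positively correlated series like this one.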

The other method applies only to IID data and is demonstrated in the extreme values example below. In this method, the subsamples are samples of size b drawn without replacement from the original sample. This allows many more subsamples than the block method and hence a more accurate bootstrap.
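To get a feel for how many more subsamples the IID method allows, the two counts can be compared directly. The sizes n = 100 and b = 10 are arbitrary values chosen for the illustration.

```python
import math
import random

n, b = 100, 10  # hypothetical sample and subsample sizes

n_blocks = n - b + 1         # contiguous blocks (time series method)
n_subsets = math.comb(n, b)  # subsets of size b (IID method)
print(n_blocks)              # 91
print(n_subsets)

# There are far too many subsets to enumerate, so in practice one
# draws a random selection of them.
rng = random.Random(3)
x = list(range(n))
some_subsamples = [rng.sample(x, b) for _ in range(5)]
print(some_subsamples[0])
```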

Time Series

Sections 8.5 and 8.6 in Efron and Tibshirani.


Extreme Values

Section 7.4 in Efron and Tibshirani.
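A minimal sketch of an extreme values problem where the rate is not 1 ⁄ 2. It assumes IID Uniform(0, θ) data with the sample maximum as the estimator of θ. This setup is an assumption made for illustration and is not necessarily the dataset used on this page. In that setting n (θ − t_n) converges to an exponential distribution with mean θ, so r = 1.

```python
import random

rng = random.Random(2006)
theta = 1.0                # true endpoint, known only because we simulate
n, b, r = 10000, 100, 1.0  # r = 1: the maximum converges at rate n
x = [rng.uniform(0.0, theta) for _ in range(n)]
t_n = max(x)

# Subsampling distribution of b^r (t_b* - t_n), using IID subsamples.
nsub = 2000
z = sorted(b ** r * (max(rng.sample(x, b)) - t_n) for _ in range(nsub))

# Empirical 2.5% and 97.5% quantiles of the subsampling distribution.
q_lo = z[int(0.025 * (nsub - 1))]
q_hi = z[int(0.975 * (nsub - 1)) + 1]

# Invert q_lo <= n^r (t_n - theta) <= q_hi to get bounds on theta.
lower, upper = t_n - q_hi / n ** r, t_n - q_lo / n ** r
print(t_n, lower, upper)
```

Every value in `z` is at most zero because a subsample maximum cannot exceed the full-sample maximum, so the interval lies at or above t_n. That matches the fact that the sample maximum always underestimates θ.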
