University of Minnesota, Twin Cities School of Statistics Stat 5601 Rweb
This page still under construction (again). Don't look yet. Or you can look, but be warned it's unfinished.
The theory of the parametric bootstrap is quite similar to that of the nonparametric bootstrap. The only difference is that instead of simulating bootstrap samples that are i. i. d. from the empirical distribution (the nonparametric estimate of the distribution of the data), we simulate bootstrap samples that are i. i. d. from the estimated parametric model.
All the same considerations arise.
theta hat is not the true parameter value theta.
We do not sample from the correct distribution. We should sample from Fθ. We do sample from the same thing with a hat on the θ (which I can't do on a web page).
Simulating from a parametric model is not as easy as simulating from the empirical distribution. In fact, it can be arbitrarily complicated, so hard that how to do it is an open research problem. For some parametric models sampling is easy, for others not. In general, it bears no relation to sampling from the empirical distribution.
If the observed data are in the vector x, then
x.star <- sample(x, replace = TRUE)
makes a nonparametric bootstrap sample.
In contrast, if the observed data are assumed to be i. i. d. normal, then
x.star <- rnorm(length(x), mean = mean(x), sd = sd(x))
makes a parametric bootstrap sample. This does not do the right thing, because we should specify mean to be the true population mean and sd to be the true population standard deviation (but since we don't know the population values, we must use estimates).
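The two sampling schemes can be compared side by side. The following is a minimal sketch, not part of the original page: the data x are simulated stand-ins, and the statistic (the sample median), the seed, and nboot are arbitrary choices made for illustration.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
x <- rnorm(30, mean = 10, sd = 2)  # hypothetical data standing in for the observed x

nboot <- 1000                      # number of bootstrap samples (arbitrary)
theta.star.np <- double(nboot)     # medians of nonparametric bootstrap samples
theta.star.p <- double(nboot)      # medians of parametric (normal) bootstrap samples

for (i in 1:nboot) {
    # nonparametric: resample from the empirical distribution
    x.star <- sample(x, replace = TRUE)
    theta.star.np[i] <- median(x.star)
    # parametric: simulate from the normal model with estimated parameters
    x.star <- rnorm(length(x), mean = mean(x), sd = sd(x))
    theta.star.p[i] <- median(x.star)
}

# bootstrap standard errors of the sample median under each scheme
se.np <- sd(theta.star.np)
se.p <- sd(theta.star.p)
print(c(nonparametric = se.np, parametric = se.p))
```

Only the line that generates x.star differs between the two schemes. Everything done with the bootstrap samples afterwards is the same.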
For more contrast, if the observed data are assumed to be i. i. d. Cauchy, then
x.star <- rcauchy(length(x), location = median(x), scale = IQR(x) / 2)
makes a parametric bootstrap sample. We can't use mean(x) and sd(x) as estimators of location and scale because the Cauchy distribution doesn't have moments and hence these aren't consistent estimators (of anything, much less location and scale).
Why median(x) and IQR(x) / 2 are consistent (even asymptotically normal) estimators of location and scale would be more theory than we want to go into here. The only point we wanted to make is that the three examples look a lot different from each other.
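Without going into the theory, the Cauchy estimators can at least be checked numerically. The sketch below is an illustration only: the "observed" data are simulated with a location (3) and scale (2) chosen arbitrarily, so that the estimates can be compared with known values. It relies on the fact that the quartiles of a Cauchy distribution sit at location minus scale and location plus scale, which is why half the interquartile range estimates the scale.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
n <- 10000
x <- rcauchy(n, location = 3, scale = 2)  # hypothetical data with known parameters

# quartiles of the true distribution: location - scale and location + scale
print(qcauchy(c(0.25, 0.75), location = 3, scale = 2))

# estimators of location and scale used in the text
loc.hat <- median(x)
scale.hat <- IQR(x) / 2
print(c(location = loc.hat, scale = scale.hat))

# one parametric bootstrap sample from the fitted Cauchy model
x.star <- rcauchy(length(x), location = loc.hat, scale = scale.hat)
```

With a sample this large, both estimates should land close to the values used in the simulation.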
To get to some examples without wading through a tremendous amount of theory, we will stick to one parametric model for which the sampling looks fairly similar to the nonparametric bootstrap. This is the multinomial distribution.
The multinomial distribution is the distribution of categorical measurements on i. i. d. individuals. The numbers of individuals in each category make up the data vector x, and the probabilities of individuals being in each category make up a probability vector p (where probability vector means all(p >= 0) and sum(p) == 1).
Given a probability vector p of length k and a sample size n, one creates a multinomial sample with the R statements
c.star <- sample(1:k, n, prob = p, replace = TRUE)
x.star <- tabulate(c.star, k)
(The first statement creates an i. i. d. sample of category numbers with the specified probabilities. The second counts the number of individuals in each category. So x.star is a vector of length k.)
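A short runnable sketch of this recipe follows. The probability vector p and the sample size n are made-up values for illustration. The sketch also draws a sample with the base R function rmultinom, which is an alternative to the two-statement recipe and not something the original text uses.

```r
set.seed(7)                  # arbitrary seed, for reproducibility only
p <- c(0.1, 0.2, 0.3, 0.4)   # hypothetical probability vector
k <- length(p)
n <- 50                      # hypothetical sample size

# check that p is a probability vector; all.equal guards against
# floating-point rounding that could make sum(p) == 1 fail
stopifnot(all(p >= 0), isTRUE(all.equal(sum(p), 1)))

# the recipe from the text: sample category numbers, then count them
c.star <- sample(1:k, n, prob = p, replace = TRUE)
x.star <- tabulate(c.star, k)
print(x.star)

# alternative: rmultinom returns a k by 1 matrix of counts for one sample
x.star.alt <- as.vector(rmultinom(1, size = n, prob = p))
print(x.star.alt)
```

Either way the result is a vector of k counts that sum to n.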
For example, suppose we observe the multinomial data defined to be x in the form below, and we want to test the null hypothesis that the true category probabilities are all equal (to 1 / 6 because there are 6 categories). The R function chisq.test does the usual chi-square test that uses the large-sample approximation (that the chi-square test statistic has a chi-square distribution). The remainder of the code does the parametric bootstrap test.
Actually, since the null hypothesis is completely specified here this is, strictly speaking, a Monte Carlo test rather than a parametric bootstrap. The test is exact.
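A test of this kind might be coded as in the sketch below. It is not the code from the original form: the counts in x are invented for illustration, and the helper chisq.stat, the seed, and nboot are assumptions. The P-value uses the standard Monte Carlo form (count + 1) / (nboot + 1), which counts the observed statistic as one of the simulations.

```r
set.seed(101)                  # arbitrary seed, for reproducibility only
x <- c(12, 7, 9, 15, 5, 12)    # hypothetical counts, NOT the data from the form
k <- length(x)
n <- sum(x)
p0 <- rep(1 / k, k)            # null hypothesis: all categories equally likely

# usual chi-square test using the large-sample approximation
out <- chisq.test(x, p = p0)
print(out$p.value)

# hypothetical helper: Pearson chi-square statistic for counts x and probabilities p
chisq.stat <- function(x, p) {
    e <- sum(x) * p            # expected counts
    sum((x - e)^2 / e)
}
t.obs <- chisq.stat(x, p0)

# simulate the null distribution of the test statistic
nboot <- 9999
t.star <- double(nboot)
for (i in 1:nboot) {
    c.star <- sample(1:k, n, prob = p0, replace = TRUE)
    x.star <- tabulate(c.star, k)
    t.star[i] <- chisq.stat(x.star, p0)
}

# Monte Carlo P-value; the small tolerance keeps simulated values that tie
# with t.obs from being lost to floating-point rounding
p.mc <- (sum(t.star >= t.obs - 1e-8) + 1) / (nboot + 1)
print(p.mc)
```

When the expected counts are reasonably large, the two P-values should be close. They can differ noticeably when expected counts are small, which is where the simulation-based test is most useful.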
For another example, suppose we observe the Poisson data defined to be x in the form below, and we want to test the null hypothesis
For another example, suppose we observe the contingency table defined to be x in the form below, and we want to test the null hypothesis of independence (that the row category labels and column category labels are independent random variables).
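One way such a test could be coded is sketched below. It is not the code from the original form: the table x is invented for illustration, and the helper chisq.stat, the seed, and nboot are assumptions. Here the null hypothesis does not completely specify the cell probabilities, so they are estimated from the row and column margins, and the simulation is from the estimated model.

```r
set.seed(202)                  # arbitrary seed, for reproducibility only
# hypothetical 2 by 3 contingency table, NOT the data from the form
x <- matrix(c(20, 15, 10,
              25, 12, 18), nrow = 2, byrow = TRUE)
n <- sum(x)

# estimated cell probabilities under independence:
# product of estimated row and column probabilities
p.hat <- outer(rowSums(x) / n, colSums(x) / n)

# hypothetical helper: Pearson chi-square statistic for independence
chisq.stat <- function(x) {
    e <- outer(rowSums(x), colSums(x)) / sum(x)   # expected counts
    ok <- e > 0                # skip cells in an empty row or column
    sum(((x - e)^2 / e)[ok])
}
t.obs <- chisq.stat(x)

# usual large-sample test, for comparison
out <- chisq.test(x, correct = FALSE)
print(out$p.value)

# parametric bootstrap: simulate tables from the estimated multinomial model
nboot <- 999
ncell <- length(p.hat)
t.star <- double(nboot)
for (i in 1:nboot) {
    c.star <- sample(1:ncell, n, prob = as.vector(p.hat), replace = TRUE)
    # as.vector and matrix both use column-major order, so cells line up
    x.star <- matrix(tabulate(c.star, ncell), nrow = nrow(x))
    t.star[i] <- chisq.stat(x.star)
}

# bootstrap P-value, with a small tolerance for floating-point ties
p.boot <- (sum(t.star >= t.obs - 1e-8) + 1) / (nboot + 1)
print(p.boot)
```

Note that the statistic for each simulated table is computed from that table's own margins, just as it is for the observed table.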