To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.
The theory of the parametric bootstrap is quite similar to that of the nonparametric bootstrap, the only difference is that instead of simulating bootstrap samples that are IID from the empirical distribution (the nonparametric estimate of the distribution of the data) we simulate bootstrap samples that are IID from the estimated parametric model.
All the same considerations arise.
theta hatis not the true parameter value
theta. We do not sample from the correct distribution. We should sample from Fθ. We do sample with the same thing with a
haton the θ (which I can't do on a web page).
And so forth.
Simulating from a parametric model is not so easy as simulating from the
empirical distribution. In fact, it can be arbitrarily complicated. So
hard that it is an open research problem
how to do it. For some
parametric models sampling is easy, others not. In general, it bears no
relation to samping from the empirical.
If the observed data are in the vector x
, then
x.star <- sample(x, replace = TRUE)
makes a nonparametric bootstrap sample.
In contrast, if the observed data are assumed to be IID normal, then
x.star <- rnorm(length(x), mean = mean(x), sd = sd(x))
makes a parametric bootstrap sample. This does not do the right thing
because we should specify mu
to be the true population mean
and sd
to be the true population standard deviation (but since
we don't know the population values we must use estimates).
For more contrast, if the observed data are assumed to be IID Cauchy, then
x.star <- rcauchy(length(x), location = median(x), scale = IQR(x) / 2)
makes a parametric bootstrap sample. We can't use mean(x)
and sd(x)
as estimators of location and scale because the
Cauchy distribution doesn't have moments and hence these aren't consistent
estimators (of anything, much less location and scale).
Why median(x)
and IQR(x) / 2
are consistent
(even asymptotically normal) estimators of location and scale would be
more theory than we want to go into here. The only point we wanted to
make is that the three examples look a lot different from each other.
mean(x, trim = 0.25)
for the location estimator
(
on-line help). The untrimmed mean is optimal for the
normal distribution (assuming we really believe the data are normal),
but we can use whatever estimator we want and the bootstrap is still
valid.
mad(x)
for the scale estimator
(
on-line help). The standard deviation
is optimal for the normal, but . . . .
mean(x, trim = 0.25)
for the location estimator
(
on-line help). No simple estimator is optimal.
mad(x)
for the scale estimator
(
on-line help). No simple estimator is optimal.
We change the constant
argument to be right for
the Cauchy distribution. The population MAD being given by
qcauchy(0.75)
, which is 1.0.
To get to some examples with wading through a tremendous amount of theory, we will stick to one parametric model for which the sampling looks fairly similar to the nonparametric bootstrap. This is the multinomial distribution.
The multinomial distribution is the distribution of categorical measurements
on IID individuals. The number of individuals in each category make up
the data vector x
and the probabilities of individuals being
in each category make up a probability vector p
(where probability vector
means all(p >= 0)
and
sum(p) == 1
).
Given a probability vector p
of length k
and a sample size n
one creates a multinomial sample with the R statements
c.star <- sample(1:k, n, prob = p, replace = TRUE) x.star <- tabulate(c.star, k)
(The first statement creates an IID sample of category numbers with
the specified probabilities. The second counts the number of individuals
in each category. So x.star
is a vector of length k
.)
For example, suppose we observe the multinomial data defined to
be x
in the form below, and we want to test the null hypothesis
that the true category probabilities are all equal (to 1 / 6 because there
are 6 categories). The R function chisq.test
does the usual
chi-square test that uses the large-sample approximation (that the chi-square
test statistic has a chi-square distribution). The remainder of the code
does the parametric bootstrap test.
Actually, since the null hypothesis
is completely specified here this is, strictly speaking, a Monte Carlo
test rather than a parametric bootstrap. The test is exact.
(We only say we are doing a parametric bootstrap
when the
null distribution of the test statistic depends on the true unknown
parameter value and we are plugging in an estimate for the unknown truth).
For another example, suppose we observe the Poisson data defined to
be x
in the form below, and we want to test the null hypothesis
that the data is Poisson against the alternative hypothesis that is isn't.
The usual way to do this is to truncate the data, making a category
8 or more
for example. The reason for the truncation is that
the usual asymptotics of the chi-square test requires it. But this
introduces needless complication when we are not using the asymptotics
(except that theta hat
is near theta
).
tabulate(1 + x, 1 + max(x))
is that x
contains integers starting at zero
but tabulate
wants integers starting at one.
tstat
function is the likelihood ratio test statistic.
By standard likelihood theory, it is asymptotically equivalent to
the chi-square test statistic, but is not exactly the same for
any finite sample size.
foo[counts == 0] <- 0
is that
when counts[i]
is zero, then so is p1[i]
,
but that makes foo[i]
zero times minus infinity equals
NaN
, which is not right. Since x log(x)
converges to zero as x goes to zero, we want foo[i]
to be zero in that case.
(A goodness of fit
test like this is just like every other hypothesis test except that we
want to accept H0 as opposed to the usual
situation where we want to reject. Our acceptance of H0
doesn't mean that the data actually have the Poisson distribution, only
that they provide no reason to disbelieve they do.)
For another example, suppose we observe the contingency table defined to
be x
in the form below, and we want to test the null hypothesis
of independence (that the row category labels and column category labels
are independent random variables).
For some reason, the R chisq.test
function does this test
for 2-way tables by simulation, but doesn't do the analogous test for
1-way tables (go figure).
The glm
function in R
(
on-line help) does so-called generalized linear models
.
The theory of these is usually covered in the categorical data course
(like Stat 5421), so we won't cover it here.
For those interested, I wrote up some notes about generalized linear models for the other class I am teaching this semester (Stat 5931), but there's no real need to look at it to understand this example.
The response variable in this problem kyphosis
is categorical with values present
or absent
which we model as independent but not identically distributed
Bernoulli random variables
where the so-called mean value parameter vector
is related to the so-called linear predictor vector
by the link function
and the linear predictor vector has the usual form of a mean function in ordinary linear regression
where X is the so-called design matrix or model matrix for the problem (the rows are cases and the columns are predictor variables) and β is a vector of regression coefficients.
Kyphosis is a misalignment of the spine. The data are on 83 laminectomy
(a surgical procedure involving the spine) patients. The predictor variables
are age
and age^2
(that is, a quadratic function
of age), number
of vertebrae involved in the surgery
and start
the vertebra number of the first vertebra involved.
pred <- predict(out2, type = "response")
estimates the mean value parameter vector p.
If type = "response"
it produces the linear predictor η
which is not what we need for the next item.
kyphosis.star <- rbinom(n, 1, pred)
simulates new Bernoulli data with mean value parameter vector p.
That's what the parametric bootstrap requires.
dev.star
and the P-value pev.star
(which can also be thought of as a test statistic with the proviso
that we reject the null for low values of pev
as opposed
to high values of dev
and other usual test statistics).