General Instructions

To do each example, just click the Submit button. You do not have to type in any R instructions or specify a dataset; that's already done for you.
Theory

The parametric bootstrap, as its name says, simulates from the parametric model. We say bootstrap rather than simulation because the former term recognizes that we are doing the wrong thing: simulating under the distribution indicated by our parameter estimate rather than under the true unknown distribution.
As with the nonparametric bootstrap, there are presumably many different ways to construct parametric bootstrap confidence intervals, although the literature on the subject is thin (the nonparametric bootstrap gets most of the attention). We illustrate just one method: the parametric analog of nonparametric bootstrap t confidence intervals.
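Before turning to the real example, the idea can be illustrated on a hypothetical toy problem (not the kyphosis analysis; the model, sample size, and seed here are all made up for illustration): a parametric bootstrap t confidence interval for the mean of an exponential model.

```r
# Toy illustration: parametric bootstrap t interval for an exponential mean.
set.seed(42)
x <- rexp(30, rate = 1 / 5)            # "observed" data (true mean 5)
n <- length(x)
theta.hat <- mean(x)                   # MLE of the mean
se.theta.hat <- theta.hat / sqrt(n)    # estimated standard error (exponential model)

nboot <- 999
z.star <- double(nboot)
for (i in 1:nboot) {
    # simulate from the FITTED model -- that is what makes it parametric
    x.star <- rexp(n, rate = 1 / theta.hat)
    theta.star <- mean(x.star)
    se.star <- theta.star / sqrt(n)
    # bootstrap analog of the asymptotically pivotal quantity
    z.star[i] <- (theta.star - theta.hat) / se.star
}
crit <- quantile(z.star, c(0.025, 0.975))
# bootstrap t interval: note the critical values swap ends
ci <- theta.hat - rev(crit) * se.theta.hat
print(ci)
```

The only difference from the nonparametric bootstrap t interval is the simulation step: new data come from the fitted parametric distribution rather than from resampling the observed data.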
Practice

We are going to do a logistic regression example. The response variable in this problem, kyphosis, is categorical with values present or absent, which we model as independent but not identically distributed Bernoulli random variables.

Kyphosis is a misalignment of the spine. The data are on 83 laminectomy (a surgical procedure involving the spine) patients. The predictor variables are age and age^2 (that is, a quadratic function of age), number, the number of vertebrae involved in the surgery, and start, the vertebra number of the first vertebra involved. The response is presence or absence of kyphosis after the surgery (and perhaps caused by it).
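A sketch of the model fit, under some assumptions: the data frame name (kyphosis, as shipped in the rpart package) and the object name out2 are taken from the commentary that follows, and the actual example on this page may differ in detail.

```r
# Sketch of the logistic regression fit (assumes the kyphosis data frame
# from the rpart package; variable names there are Kyphosis, Age, Number, Start).
library(rpart)
out2 <- glm(Kyphosis ~ Age + I(Age^2) + Number + Start,
    family = binomial, data = kyphosis)
summary(out2)
pred <- predict(out2, type = "response")  # estimated mean value parameter vector
```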
Comments

- The command
  pred <- predict(out2, type = "response")
  estimates the mean value parameter vector.
- The command
  kyphosis.star <- rbinom(n, 1, pred)
  simulates new Bernoulli data with that mean value parameter vector. That's what the parametric bootstrap requires.
- We save both the bootstrap values of the estimator and its standard error.
- Then we make the bootstrap analog of the asymptotically pivotal (asymptotically standard normal) quantity
  (theta.hat - theta) / se.theta.hat
- The warnings show that we are actually far from asymptopia. In some bootstrap samples the MLE is (perhaps) at infinity, and in any case we are in trouble with the asymptotics. But we do not throw these samples out of the bootstrap distribution. They reflect the actual performance of logistic regression for these data and this model.
- The histogram shows that the sampling distribution of this asymptotically pivotal quantity is not only not standard normal, it is both skewed and biased.
- It just turns out that the bootstrap critical values do not reflect the skewness, but this is only because of the confidence level we chose. They would be very different from plus or minus the same quantity if we chose a 99% confidence level.
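Putting the comments above together, the whole procedure might be sketched as follows. This is a hedged reconstruction, not the exact code of the example: it assumes out2 is the fitted logistic regression, the kyphosis data frame comes from the rpart package, and the coefficient of interest is (arbitrarily, for illustration) the one for start.

```r
# Sketch of the parametric bootstrap t interval for one logistic regression
# coefficient (assumptions: rpart's kyphosis data, the "Start" coefficient).
library(rpart)
out2 <- glm(Kyphosis ~ Age + I(Age^2) + Number + Start,
    family = binomial, data = kyphosis)
pred <- predict(out2, type = "response")   # estimated mean value parameters
n <- length(pred)

set.seed(17)
nboot <- 999
theta.star <- double(nboot)
se.theta.star <- double(nboot)
for (i in 1:nboot) {
    # simulate new Bernoulli responses from the FITTED model
    kyphosis.star <- rbinom(n, 1, pred)
    out.star <- glm(kyphosis.star ~ Age + I(Age^2) + Number + Start,
        family = binomial, data = kyphosis)
    sout <- summary(out.star)
    # save both the estimator and its standard error
    theta.star[i] <- sout$coefficients["Start", "Estimate"]
    se.theta.star[i] <- sout$coefficients["Start", "Std. Error"]
}

sout2 <- summary(out2)
theta.hat <- sout2$coefficients["Start", "Estimate"]
se.theta.hat <- sout2$coefficients["Start", "Std. Error"]

# bootstrap analog of the asymptotically pivotal quantity
z.star <- (theta.star - theta.hat) / se.theta.star
hist(z.star)   # skewed and biased, not standard normal

crit <- quantile(z.star, c(0.025, 0.975))
ci <- theta.hat - rev(crit) * se.theta.hat   # 95% bootstrap t interval
print(ci)
```

Expect warnings from glm (fitted probabilities numerically 0 or 1) in some bootstrap samples; as the comments say, those samples stay in the bootstrap distribution.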