Make a QQ plot of a supposedly chi-square random variable against
the theoretical chi-square distribution. Try to make it look as
much like that produced by qqnorm
as possible.
R doesn't have a function to do this. Look at the code for qqnorm
to see what it does. The R statements

    qqnorm
    methods(qqnorm)
    qqnorm.default

first show the definition of qqnorm, which turns out to be
uninformative (just UseMethod("qqnorm")), then check what methods
qqnorm has, and finally show the definition of the only method it
has (why use this methods stuff when there is only one method is a
mystery that only the author of the function could elucidate).
As someone brought up in class, there doesn't seem to be much point
in following your QQ plot with qqline(foo). What you want to plot is
the line of identity (the graph of the identity function f(x) = x).
So add that line to your plot.
For test data you can use truly chi-square data produced by rchisq.
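A minimal sketch of such a plot. The degrees of freedom, sample
size, and seed here are arbitrary choices for testing, not part of
the problem statement:

```r
df <- 5                  # arbitrary degrees of freedom for testing
n <- 100
set.seed(42)
foo <- rchisq(n, df)     # truly chi-square test data

# theoretical quantiles at the plotting positions ppoints(n),
# the same positions qqnorm uses
theo <- qchisq(ppoints(n), df)

plot(theo, sort(foo), main = "Chi-Square Q-Q Plot",
    xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
abline(0, 1)             # line of identity, not qqline(foo)
```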
Write an R function that takes an arbitrary continuous distribution
having a q function (like qnorm or qchisq) as an argument, as well
as some data y, and produces a plot like the one you made in the
first problem. Be sure to include tests for bogus arguments. Again
the R source for qqnorm is a good example.
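One possible sketch. The function name qqdist and the particular
argument checks are our choices, not required by the problem:

```r
qqdist <- function(y, qfun, ...) {
    # bogus-argument checks, in the spirit of those in qqnorm
    if (!is.numeric(y) || !all(is.finite(y)))
        stop("y must be finite numeric")
    if (!is.function(qfun))
        stop("qfun must be a quantile function like qnorm or qchisq")
    n <- length(y)
    if (n < 2) stop("need at least two data points")
    # extra arguments (e.g. df for qchisq) are passed through ...
    theo <- qfun(ppoints(n), ...)
    plot(theo, sort(y), xlab = "Theoretical Quantiles",
        ylab = "Sample Quantiles")
    abline(0, 1)         # line of identity
    invisible(list(x = theo, y = sort(y)))
}

## usage: qqdist(rchisq(100, df = 5), qchisq, df = 5)
```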
Do a simulation to see whether the deviance
statistic for
a particular logistic regression is well approximated by a chi-square
distribution. For predictor variables let us use one quantitative
and one categorical variable created as follows

    n <- 30
    set.seed(111)
    x <- rnorm(n)
    w <- sample(c("red", "green", "blue"), n, replace = TRUE)
    w <- as.factor(w)

and one Bernoulli response created as follows (with the random
number generator in the state it was in just after creating x and w)

    betax <- 1
    betaw <- c(0.75, -1.5, 0.33333)
    names(betaw) <- c("red", "green", "blue")
    invlogit <- function(x) 1 / (1 + exp(- x))
    eta <- betax * x + betaw[as.character(w)]
    p <- as.numeric(invlogit(eta))
    y <- rbinom(n, 1, p)

Fit two standard logistic regression models ("standard" here meaning
using the logit link function): one, the big model, having both x
and w as predictors, and the other, the little model, having just x.
Print out the model summaries.
(We actually know the true parameter values betax and betaw, but
pretend we don't.)
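Assuming the data setup above, the two fits might look like the
following sketch. The object names big and little are our choices,
and we write the inverse logit directly as 1 / (1 + exp(- eta)) so
the sketch is self-contained:

```r
# data setup from the problem statement
n <- 30
set.seed(111)
x <- rnorm(n)
w <- factor(sample(c("red", "green", "blue"), n, replace = TRUE))
betax <- 1
betaw <- c(red = 0.75, green = -1.5, blue = 0.33333)
eta <- betax * x + betaw[as.character(w)]
p <- as.numeric(1 / (1 + exp(- eta)))   # standard inverse logit
y <- rbinom(n, 1, p)

# the two logistic regressions; binomial family uses the logit
# link by default, so these are "standard" in the sense above
big <- glm(y ~ x + w, family = binomial)
little <- glm(y ~ x, family = binomial)
summary(big)
summary(little)
```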
Now do a parametric bootstrap simulation of the null distribution of
the test statistic. This involves the following steps.

First, calculate p.hat.null, the predicted success probabilities
under the little model. (Hint: help(predict.glm).)
Then repeat the following nboot times. Generate a bootstrap response
yboot just like we generated y but using the mean value parameter
p.hat.null instead of p. Refit the models using yboot for the
response instead of y and compute the test statistic. Store this
deviance, say as dev[iboot].
Compare the simulated null distribution of the test statistic (the
values stored in dev) with its large-sample chi-square approximation
using the QQ plot ideas you developed in the preceding problems.
Also compute the simulation P-value, which is mean(dev >= dev.obs),
where dev.obs is the value of the test statistic corresponding to
the observed data y. Compare this with the P-value derived from the
large-sample approximation.
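Putting the steps together, here is one sketch of the whole
simulation, repeating the data setup so it is self-contained. The
choice nboot = 999 is ours; the chi-square degrees of freedom are 2
because the big model has two more coefficients than the little one
(the two dummy variables for w):

```r
# data setup from the problem statement
n <- 30
set.seed(111)
x <- rnorm(n)
w <- factor(sample(c("red", "green", "blue"), n, replace = TRUE))
betax <- 1
betaw <- c(red = 0.75, green = -1.5, blue = 0.33333)
eta <- betax * x + betaw[as.character(w)]
p <- as.numeric(1 / (1 + exp(- eta)))   # standard inverse logit
y <- rbinom(n, 1, p)

# observed test statistic: difference of residual deviances
big <- glm(y ~ x + w, family = binomial)
little <- glm(y ~ x, family = binomial)
dev.obs <- little$deviance - big$deviance

# predicted success probabilities under the little (null) model
p.hat.null <- predict(little, type = "response")

nboot <- 999
dev <- double(nboot)
for (iboot in 1:nboot) {
    # bootstrap response: like y, but with mean p.hat.null
    yboot <- rbinom(n, 1, p.hat.null)
    little.boot <- glm(yboot ~ x, family = binomial)
    big.boot <- glm(yboot ~ x + w, family = binomial)
    dev[iboot] <- little.boot$deviance - big.boot$deviance
}

# QQ plot of the simulated null distribution against chi-square,
# df = 2, with the line of identity as in the earlier problems
plot(qchisq(ppoints(nboot), df = 2), sort(dev),
    xlab = "Theoretical Quantiles", ylab = "Simulated Quantiles")
abline(0, 1)

mean(dev >= dev.obs)                         # simulation P-value
pchisq(dev.obs, df = 2, lower.tail = FALSE)  # large-sample P-value
```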