Fitting a Generalized Linear Model
Try 1
We start with a simple generalized linear model in which the linear predictor has the form θ = β1 + β2 x. The statement containing the curve function is a bit tricky, but it is the R idiom for adding the fitted regression function to a plot. More on this below.
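A sketch of what such a fit and plot might look like. The variable names (x, y, out1) and the Poisson family are assumptions, chosen to match the z tests shown later on this page; the data here are simulated for illustration.

```r
# sketch: x and y are hypothetical data (assumptions, not from the source)
x <- seq(1, 10, length = 100)
y <- rpois(100, exp(0.5 + 0.1 * x))

out1 <- glm(y ~ x, family = poisson)
summary(out1)

# the tricky curve statement: curve() evaluates its first argument
# with x playing the role of the variable on the horizontal axis
plot(x, y)
curve(predict(out1, newdata = data.frame(x = x), type = "response"),
      add = TRUE)
```

The trick is that inside curve the name x refers to the plotting variable, so wrapping it in data.frame(x = x) hands curve's grid of points to predict.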
Try 2
Suppose we had a suspicion of periodicity and did not believe the regression function is linear on the linear predictor scale (which, as we see, is curvilinear on the mean value parameter scale). We try a model with period the length of x.
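One way to set up such a model is to convert the predictor to an angle and regress on its sine and cosine. This is a sketch; the construction of w and the object names are assumptions about the original setup.

```r
# sketch: convert x to an angle w that completes one cycle over
# the length of x (this construction is an assumption)
w <- x / max(x) * 2 * pi
out2 <- glm(y ~ I(sin(w)) + I(cos(w)), family = poisson)
summary(out2)
```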
Try 3
Better, but perhaps a model with period half the length of x would do better still.
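Adding the second harmonic gives components with period half the length of x as well. A sketch, with w as constructed above (the object names are assumptions, but the model formula matches the coefficient names in the printout below):

```r
# sketch: period 2 model, adding the second harmonic
out3 <- glm(y ~ I(sin(w)) + I(cos(w)) + I(sin(2 * w)) + I(cos(2 * w)),
            family = poisson)
summary(out3)
```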
Hypothesis Tests
Tests about Regression Coefficients
In the output for the preceding example

              Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.65926    0.05135  32.314  < 2e-16 ***
I(sin(w))      0.99526    0.07996  12.448  < 2e-16 ***
I(cos(w))     -0.05856    0.05530  -1.059 0.289671    
I(sin(2 * w))  0.21760    0.05999   3.627 0.000287 ***
I(cos(2 * w))  0.54401    0.06331   8.593  < 2e-16 ***

the numbers in the Pr(>|z|) column are P-values for five tests of the hypotheses that one of the true coefficients β1, …, β5 is zero.
To actually take all five of these P-values seriously is multiple testing without correction and makes no sense. (To take one seriously — one that was chosen before the data were seen — is valid.)
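If one nevertheless wanted to look at all five P-values, a multiple-testing correction would be needed. A sketch using R's p.adjust (the model object name out3 is an assumption):

```r
# sketch: Bonferroni-corrected P-values for the coefficients of the
# period 2 fit; out3 is an assumed name for that fitted model object
pvals <- summary(out3)$coefficients[, "Pr(>|z|)"]
p.adjust(pvals, method = "bonferroni")
```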
Model Comparison Tests
In this section we consider testing a big model versus a little model. The null hypothesis is one regression model (the little model), the alternative hypothesis is another regression model (the big model), and the little model is a submodel of the big model: it is obtained by setting some of the regression coefficients of the big model to zero.
As an example, let us consider the periodic models fit above. We can compare as many models as we want so long as they form a nested sequence. Here we have three models: the smallest one, with formula y ~ 1, fits a constant regression function; the middle one is the period 1 model fit above; and the largest one is the period 2 model fit above.
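The whole nested sequence can be compared with a single call to anova. A sketch, assuming the models and the angle w from the earlier examples (all object names are assumptions):

```r
# sketch of the nested sequence of likelihood ratio (chi-square) tests
out.constant <- glm(y ~ 1, family = poisson)
out.period1  <- glm(y ~ I(sin(w)) + I(cos(w)), family = poisson)
out.period2  <- glm(y ~ I(sin(w)) + I(cos(w)) +
                        I(sin(2 * w)) + I(cos(2 * w)), family = poisson)
anova(out.constant, out.period1, out.period2, test = "Chisq")
```

Each row of the anova table tests one model against the next larger one in the sequence.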
The interpretation of the tests is that both null hypotheses are rejected. The smallest model fits (much) worse than the middle model and the middle model fits (much) worse than the big model. Conclusion: only the big model fits (perhaps — maybe an even bigger model is needed).
Note that you can't get the same effect by looking at the five P-values for the five regression coefficients of the big model separately. That violates the "do only one test" rule.
If you want to make this model comparison, you need to do this test.
Confidence Intervals
Confidence Intervals for Regression Coefficients
In the output for the period 2 example

              Estimate Std. Error z value Pr(>|z|)    
(Intercept)    1.65926    0.05135  32.314  < 2e-16 ***
I(sin(w))      0.99526    0.07996  12.448  < 2e-16 ***
I(cos(w))     -0.05856    0.05530  -1.059 0.289671    
I(sin(2 * w))  0.21760    0.05999   3.627 0.000287 ***
I(cos(2 * w))  0.54401    0.06331   8.593  < 2e-16 ***

the printout gives us the numbers needed to construct the confidence interval for one regression coefficient.
- The first number is the estimate, also called the sample regression coefficient.
- The second number is the standard error of this estimate (the estimated standard deviation of the sampling distribution of the estimate).
Thus one makes up a confidence interval in the usual way

estimate ± critical value × standard error

where the critical value here comes from the standard normal distribution. (Unlike LM, there is no variance estimate and no denominator degrees of freedom. Instead of t and F tests we have z and chi-square tests. Also, these tests are only approximate rather than exact.)
If we want a 95% confidence interval, the critical value is given by
Rweb:> qnorm(0.975)
[1] 1.959964
and then the confidence interval for the parameter β5 for these data is given by
Rweb:> 0.54401 + c(-1, 1) * qnorm(0.975) * 0.06331
[1] 0.4199247 0.6680953
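The same Wald-type intervals (estimate ± critical value × standard error) for all the coefficients at once can be obtained with confint.default. A sketch; the model object name out3 is an assumption.

```r
# sketch: Wald intervals for every coefficient of the period 2 fit;
# out3 is an assumed name for that fitted model object
confint.default(out3, level = 0.95)
```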
Confidence Intervals for the Regression Function
We can also get a confidence interval for the value of the regression function μ = g(x) at some predictor point x = (x1, ..., xk).
The R function predict does this.
For a first example, take the period 2 example used frequently on this web page. Suppose we want to estimate g(w) for w = 4.5.
If the predictor is a vector, then each component must be supplied in the newdata argument.
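A sketch of the whole calculation for w = 4.5, assuming the period 2 fit is stored in out3 (the object name is an assumption):

```r
# sketch: confidence interval for the regression function at w = 4.5
pred <- predict(out3, newdata = data.frame(w = 4.5),
                se.fit = TRUE, type = "link")
# 95% interval on the linear predictor scale ...
ci.link <- pred$fit + c(-1, 1) * qnorm(0.975) * pred$se.fit
# ... mapped to the mean value parameter scale by the inverse link
# (exp for the Poisson log link)
exp(ci.link)
```

The interval is constructed on the linear predictor scale, where the normal approximation applies, and then transformed by the inverse link to get an interval for the mean.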