Fitting a Generalized Linear Model

Try 1

We start with a simple generalized linear model in which the linear predictor has the form

η = β1 + β2 x
and the response is assumed to be Poisson.

The statement containing the curve function is a bit tricky, but it is the R idiom for this job. More on this below.
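A minimal sketch of the whole fit-and-plot step, assuming the data are already in R as vectors x and y (those names are assumptions, not given above):

# Poisson GLM with the default log link; eta = beta1 + beta2 * x
out1 <- glm(y ~ x, family = poisson)
summary(out1)

# plot the data and add the fitted mean curve; inside curve() the
# symbol x is a grid of points, and predict(..., type = "response")
# returns the estimated mean mu = exp(eta) rather than eta itself
plot(x, y)
curve(predict(out1, newdata = data.frame(x = x), type = "response"),
    add = TRUE)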

Try 2

Suppose we suspect periodicity, so that a model linear on the linear predictor scale (which, as we saw, is curvilinear on the mean value parameter scale) is not right, and we try a model with period equal to the length of x

η = β1 + β2 sin(ω) + β3 cos(ω)
where ω = x / length(x) * 2 π
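In R this model might be fit as follows (a sketch; w and out.per1 are assumed names):

# angular variable making one full period over the data
w <- x / length(x) * 2 * pi

# I(...) protects each computed term so it enters the formula
# as is (matching the I(sin(w)) labels in the printouts below)
out.per1 <- glm(y ~ I(sin(w)) + I(cos(w)), family = poisson)
summary(out.per1)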

Try 3

Better, but perhaps a model that adds a second harmonic, with period half the length of x, would do better still

η = β1 + β2 sin(ω) + β3 cos(ω) + β4 sin(2 ω) + β5 cos(2 ω)
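And the corresponding sketch for this bigger model (out.per2 is an assumed name):

# add the second harmonic: terms with period half the length of x
out.per2 <- glm(y ~ I(sin(w)) + I(cos(w)) + I(sin(2 * w)) + I(cos(2 * w)),
    family = poisson)
summary(out.per2)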

Hypothesis Tests

Tests about Regression Coefficients

In the output for the preceding example

              Estimate Std. Error z value Pr(>|z|)     
(Intercept)    1.65926    0.05135  32.314  < 2e-16 *** 
I(sin(w))      0.99526    0.07996  12.448  < 2e-16 *** 
I(cos(w))     -0.05856    0.05530  -1.059 0.289671     
I(sin(2 * w))  0.21760    0.05999   3.627 0.000287 *** 
I(cos(2 * w))  0.54401    0.06331   8.593  < 2e-16 *** 
the numbers in the Pr(>|z|) column are P-values for five tests, each of the hypothesis that one of the true coefficients β1, …, β5 is zero.
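Each z value is just the estimate divided by its standard error, and each P-value is the two-tailed area under the standard normal distribution. For example, recomputing the I(cos(w)) row from the rounded numbers above:

-0.05856 / 0.05530      # z value, about -1.059
2 * pnorm(-1.059)       # two-tailed P-value, about 0.2896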

To actually take all five of these P-values seriously is multiple testing without correction and makes no sense. (To take one seriously, one that was chosen before the data were seen, is valid.)

Model Comparison Tests

In this section we consider testing a big model versus a little model: the null hypothesis is the little model, the alternative hypothesis is the big model, and the little model is a submodel of the big model (obtained by setting some of the regression coefficients of the big model to zero).

As an example, let us consider the periodic models fit above.

We can compare as many models as we want so long as they form a nested sequence. Here we have three models: the smallest, with formula y ~ 1, fits a constant regression function; the middle one is the period 1 model fit above; and the largest is the period 2 model fit above.
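A sketch of the comparison (the object names out0, out.per1, and out.per2 are assumptions carried over from the sketches above):

# smallest model: constant regression function
out0 <- glm(y ~ 1, family = poisson)

# analysis of deviance: likelihood ratio test of each model
# against the next bigger one in the nested sequence
anova(out0, out.per1, out.per2, test = "Chisq")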

The interpretation of the tests is that both null hypotheses are rejected. The smallest model fits (much) worse than the middle model, and the middle model fits (much) worse than the big model. Conclusion: only the big model fits (perhaps; maybe an even bigger model is needed).

Note that you can't get the same effect by looking separately at the five P-values for the five regression coefficients of the big model. That violates the "do only one test" rule.

If you want to make this model comparison, you need to do this test.

Confidence Intervals

Confidence Intervals for Regression Coefficients

In the output for the period 2 example

              Estimate Std. Error z value Pr(>|z|)     
(Intercept)    1.65926    0.05135  32.314  < 2e-16 *** 
I(sin(w))      0.99526    0.07996  12.448  < 2e-16 *** 
I(cos(w))     -0.05856    0.05530  -1.059 0.289671     
I(sin(2 * w))  0.21760    0.05999   3.627 0.000287 *** 
I(cos(2 * w))  0.54401    0.06331   8.593  < 2e-16 *** 
the printout gives us the numbers needed to construct a confidence interval for any one regression coefficient: its Estimate and its Std. Error.

Thus one makes up a confidence interval in the usual way

point estimate ± critical value × standard error

where the critical value here comes from the standard normal distribution. (Unlike in LM, there is no variance estimate and hence no denominator degrees of freedom. Instead of t and F tests we have z and chi-square tests, and these tests are only approximate rather than exact.)

If we want a 95% confidence interval, the critical value is given by

Rweb:> qnorm(0.975) 
[1] 1.959964

and then the confidence interval for the parameter β5 for these data is given by

Rweb:> 0.54401 + c(-1, 1) * qnorm(0.975) * 0.06331
[1] 0.4199247 0.6680953
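Rather than retyping the numbers from the printout, one can extract them from the fit; a sketch, assuming out.per2 holds the period 2 fit as above:

# the coefficients matrix of a GLM summary has columns
# "Estimate", "Std. Error", "z value", "Pr(>|z|)"
sout <- summary(out.per2)$coefficients
est <- sout["I(cos(2 * w))", "Estimate"]
se  <- sout["I(cos(2 * w))", "Std. Error"]
est + c(-1, 1) * qnorm(0.975) * se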

Confidence Intervals for the Regression Function

We can also get a confidence interval for the value of the regression function μ = g(x) at some predictor point x = (x1, ..., xk).

The R function predict does this.

For a first example, take the period 2 example used frequently on this web page. Suppose we want to estimate g(w) for w = 4.5.

If the predictor is a vector, then each component must be supplied in the newdata argument.
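A sketch of this prediction (again out.per2 is the assumed name for the period 2 fit); one common approach is to make the interval on the linear predictor scale and then map it through the inverse link function:

# standard errors on the linear predictor (link) scale
pout <- predict(out.per2, newdata = data.frame(w = 4.5),
    se.fit = TRUE, type = "link")

# 95% confidence interval for eta, then mapped through the
# inverse link (exp, since the Poisson default is the log link)
ci.eta <- pout$fit + c(-1, 1) * qnorm(0.975) * pout$se.fit
exp(ci.eta)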