Long ago we learned how to do regression.
The regression output contains most of what we need to do confidence intervals and tests of significance for linear regression.
We redo Example 21.4 in the textbook.
We use the R functions lm and summary.lm (see the on-line help for each).
Although this example is not about making the scatterplot and adding the regression line, we do that too and get a plot just like Figure 21.1 in the textbook.
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   91.268      8.934  10.216  3.5e-12 ***
Crying         1.493      0.487   3.065  0.00411 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 17.5 on 36 degrees of freedom
This table contains everything we need to make a confidence interval for the slope.
In the calculation in the textbook, they use the t critical value for 30 degrees of freedom because the book's table is incomplete.
We use R to do the right thing.
Rweb:> 1.493 + c(-1,1) * qt(0.975, df = 36) * 0.487
[1] 0.5053182 2.4806818
The critical value was
Rweb:> qt(0.975, df = 36)
[1] 2.028094
and the margin of error was
Rweb:> qt(0.975, df = 36) * 0.487
[1] 0.9876818
In practice we would round, reporting either 1.49 ± 0.99 or the interval 0.51 to 2.48.
Not much different, but in real life (as opposed to classwork) there is no excuse for using 30 degrees of freedom when 36 degrees of freedom is correct.
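The same interval can be computed directly from the fitted model object with R's confint function, which uses the correct residual degrees of freedom automatically. A minimal sketch, assuming the data are in a data frame crying with columns Crying and IQ (these names are an assumption, not shown in the printout above):

```r
# Hypothetical data frame 'crying' with columns Crying and IQ
out <- lm(IQ ~ Crying, data = crying)

# 95% confidence interval for the slope, using the exact
# residual degrees of freedom (here 36) automatically
confint(out, "Crying", level = 0.95)
```

This avoids retyping the estimate and standard error by hand, and so also avoids rounding error.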
The section title agrees with the textbook. This is a test of whether the true population regression function (assuming all the assumptions in the box on p. 564 in the textbook hold) has zero slope.
The R regression printout contains the test statistic and P-value for this test.
We redo Example 21.5 in the textbook.
The necessary regression printout is given in the output above. We merely highlight a different bit.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 91.268 8.934 10.216 3.5e-12 ***
Crying 1.493 0.487 3.065 0.00411 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The P-value (P = 0.004) agrees with the one given
in the textbook (because the book used technology
on this one).
Interpretation: These data do appear to have a linear relationship. More precisely, a linear relationship is better supported than a constant relationship (a regression function with slope zero).
The test does not say anything whatsoever about the presence or absence of a nonlinear relationship.
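The test statistic and P-value can also be extracted from the summary programmatically rather than read off the screen. A sketch, assuming the fitted model was saved as out (as in a call like out <- lm(IQ ~ Crying, data = crying)):

```r
# The coefficient table from summary.lm is a matrix with columns
# Estimate, Std. Error, t value, Pr(>|t|)
tab <- summary(out)$coefficients
tab["Crying", "t value"]   # the t statistic for H0: slope = 0
tab["Crying", "Pr(>|t|)"]  # the two-sided P-value
```

This matters when the P-value feeds into further calculation, where copying digits by hand invites mistakes.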
This section is really short. Under the assumptions in the box on p. 564 in the textbook, the correlation between the two variables is zero if and only if their population regression function has zero slope.
Therefore the heading of this section and the heading of the preceding section describe exactly the same test.
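Because the two tests are the same, R's cor.test function reproduces the t statistic and P-value from the regression printout. A sketch, again assuming the raw variables are available in a data frame crying:

```r
# t test of H0: population correlation is zero -- gives the same
# t statistic on 36 degrees of freedom as the slope test above
cor.test(crying$Crying, crying$IQ)
```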
We redo Examples 21.6 and 21.9 in the textbook.
We use the R functions lm and predict.lm (see the on-line help for each).
The intervals are
Rweb:> ##### confidence interval for mean response #####
Rweb:> predict(out, newdata = data.frame(Beers = 5),
+     interval = "confidence")
           fit        lwr        upr
[1,] 0.0771182 0.06611536 0.08812105
Rweb:> ##### prediction interval for single observation #####
Rweb:> predict(out, newdata = data.frame(Beers = 5),
+     interval = "prediction")
           fit        lwr        upr
[1,] 0.0771182 0.03191712 0.1223193
Again, they agree with the textbook because the textbook used technology.
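For completeness, here is a sketch of the whole calculation, not just the predict calls shown above. The data frame name bac and its column names Beers and BAC are assumptions:

```r
# Hypothetical data frame 'bac' with columns Beers and BAC
out <- lm(BAC ~ Beers, data = bac)

# confidence interval: where the MEAN response at Beers = 5 lies
predict(out, newdata = data.frame(Beers = 5), interval = "confidence")

# prediction interval: where a SINGLE new observation at Beers = 5
# is likely to fall -- always wider than the confidence interval
predict(out, newdata = data.frame(Beers = 5), interval = "prediction")
```

The only difference between the two calls is the interval argument, which is why the prediction interval is so easy to get wrong by accident: the point estimate (the fit column) is identical in both.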
We use the data from Example 21.4 in the textbook that was also used above.
This time we fit a quadratic regression function. We assume

Yi = β0 + β1 xi + β2 xi^2 + σ Zi

where the Zi are IID standard normal.
We are most interested here in a test of significance rather than a confidence interval. We wish to test

H0: β2 = 0 versus Ha: β2 ≠ 0
The null hypothesis here is that the coefficient of the quadratic term is zero, which means the linear regression done above is correct (more precisely, that a quadratic function seems no better).
The entire test is done in the printout.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 115.86938 25.12217 4.612 5.15e-05 ***
Crying -1.24247 2.65611 -0.468 0.643
I(Crying^2) 0.06828 0.06518 1.048 0.302
The quadratic term is not statistically significant (P = 0.302, two-tailed t test).
Note that adding another term (here Crying^2)
is easy when you use the computer. The hand calculations that would
be involved in this example are so horrendous that no book written
since the dawn of the computer age presents them.
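The fit that produced the printout above can be sketched in a single lm call (the data frame name crying is an assumption); the I() wrapper is needed so that ^ is interpreted as arithmetic rather than as formula notation. An anova comparison of the linear and quadratic fits gives an equivalent F test:

```r
# quadratic fit: I() protects the arithmetic expression Crying^2
out2 <- lm(IQ ~ Crying + I(Crying^2), data = crying)
summary(out2)

# equivalent comparison: F test of linear fit vs. quadratic fit
out1 <- lm(IQ ~ Crying, data = crying)
anova(out1, out2)
```

With a single added term, the F statistic from anova is just the square of the t statistic in the printout, so the two approaches give the same P-value.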