Chapter 47 Covariates

47.1 Thread data

These are data from Montgomery. We want to assess the strength of threads made by three different machines. Each thread is made from a batch of cotton, and some batches tend to form thicker thread than other batches. There’s no way to know how thick it will be until you make it. Regardless of how the machines may affect thread strength, thicker threads are stronger. Thus we record diameter as well as strength, with diameter as a covariate.

> library(cfcdae)
> library(car)
> strength <- c(36,41,39,42,49,40,48,39,45,44,35,37,42,34,32)
> diameter <- c(20,25,24,25,32,22,28,22,30,28,21,23,26,21,15)
> machine <- factor(rep(1:3,each=5))

47.2 Comparing machines

From the plot, it is fairly clear here that thicker threads are stronger. It is not so clear whether one machine is stronger than another; machine 3 seems a little low, machine 2 may be a bit high.

> plot(diameter,strength,pch=as.character(machine))

Figure 47.1: Plot of strength against diameter, labeled by machine

Ignoring the covariate, machine 2 seems to produce stronger thread, and machine 3 weaker. The means verify this.

> boxplot(strength ~ machine)

Figure 47.2: Boxplots of strength by machine

> tapply(strength,machine,mean)
   1    2    3 
41.4 43.2 36.0 

But machine 2 also seems to have larger diameters, and machine 3 smaller diameters. Means verify this.

> boxplot(diameter ~ machine)

Figure 47.3: Boxplots of diameter by machine

> tapply(diameter,machine,mean)
   1    2    3 
25.2 26.0 21.2 

47.3 Basic Analysis of Covariance

Let’s begin with the basic, standard Analysis of Covariance. Then we will go back and situate that in a larger context.

ANCOVA asks whether adding treatment effects improves a model that already includes the covariate. This is treatments after covariates.

> cov.then.trt <- lm(strength~diameter+machine)
> anova(cov.then.trt)
Analysis of Variance Table

Response: strength
          Df  Sum Sq Mean Sq  F value   Pr(>F)    
diameter   1 305.130 305.130 119.9330 2.96e-07 ***
machine    2  13.284   6.642   2.6106   0.1181    
Residuals 11  27.986   2.544                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There is little evidence that machine affects strength after accounting for diameter effects. This is the analysis of covariance.

However, we really need to check assumptions. The residual plot looks a bit like a flopping fish.

> plot(cov.then.trt,which=1)

Figure 47.4: Residual plot for strength data

Box-Cox suggests a reciprocal (reciprocal of strength is weakness?), but the power 1 is just within the interval.

> boxCox(cov.then.trt)

Figure 47.5: Box-Cox plot for strength data

Residuals on the reciprocal scale do look a little better, but not a lot better. To keep things simple, I’ll continue on the original scale, but further exploration of a transformation might be warranted. (Although I will point out that the range of strength is from 32 to 49, a factor of 1.5; typical power transformations are not going to be able to do much over that range.)
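If you want to look at the reciprocal scale yourself, a quick sketch is to refit the same model on 1/strength and rerun the diagnostics; `weakness` and `recip.fit` are just illustrative names, not objects used elsewhere in this chapter.

> weakness <- 1/strength
> recip.fit <- lm(weakness ~ diameter + machine)
> plot(recip.fit, which=1)
> anova(recip.fit)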

To summarize, after accounting for the covariate, there is little evidence that the treatment makes any difference.

47.4 Variance Reduction

What if we had not used our covariate? Then we would have just run a model using only machine. The MSE in the model with the covariate is 2.54, but the MSE in the model without the covariate is 17.2. That is a huge amount of variance reduction that we modeled away using the covariate.

> mach.only <- lm(strength~machine)
> anova(mach.only)
Analysis of Variance Table

Response: strength
          Df Sum Sq Mean Sq F value  Pr(>F)  
machine    2  140.4  70.200  4.0893 0.04423 *
Residuals 12  206.0  17.167                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(cov.then.trt)
Analysis of Variance Table

Response: strength
          Df  Sum Sq Mean Sq  F value   Pr(>F)    
diameter   1 305.130 305.130 119.9330 2.96e-07 ***
machine    2  13.284   6.642   2.6106   0.1181    
Residuals 11  27.986   2.544                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the same way, the standard errors of the treatment effects in the covariate model are smaller by a factor of about 2.5.

> summary(mach.only)

Call:
lm(formula = strength ~ machine)

Residuals:
   Min     1Q Median     3Q    Max 
  -5.4   -2.8   -0.4    1.4    7.6 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   40.200      1.070  37.578 8.11e-14 ***
machine1       1.200      1.513   0.793   0.4431    
machine2       3.000      1.513   1.983   0.0707 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.143 on 12 degrees of freedom
Multiple R-squared:  0.4053,    Adjusted R-squared:  0.3062 
F-statistic: 4.089 on 2 and 12 DF,  p-value: 0.04423
> summary(cov.then.trt)

Call:
lm(formula = strength ~ diameter + machine)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.0160 -0.9586 -0.3841  0.9518  2.8920 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  17.1771     2.7830   6.172 6.99e-05 ***
diameter      0.9540     0.1140   8.365 4.26e-06 ***
machine1      0.1824     0.5950   0.307    0.765    
machine2      1.2192     0.6201   1.966    0.075 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.595 on 11 degrees of freedom
Multiple R-squared:  0.9192,    Adjusted R-squared:  0.8972 
F-statistic: 41.72 on 3 and 11 DF,  p-value: 2.665e-06

47.5 Covariate adjustment

Without the covariate, the estimated difference between machines 1 and 2 is 1.8; with the covariate, the estimated difference is about 1.04.

> linear.contrast(mach.only,machine,c(1,-1,0))
  estimates       se    t-value   p-value  lower-ci upper-ci
1      -1.8 2.620433 -0.6869095 0.5051981 -7.509432 3.909432
attr(,"class")
[1] "linear.contrast"
> linear.contrast(cov.then.trt,machine,c(1,-1,0))
  estimates       se   t-value  p-value  lower-ci upper-ci
1  -1.03681 1.012913 -1.023592 0.328012 -3.266217 1.192597
attr(,"class")
[1] "linear.contrast"

Make a data frame for prediction. We will predict in all three levels of machine; for the first three predictions we use a common value of the covariate. For the last three cases we use the means of the covariates for each treatment.

> tapply(diameter,machine,mean)
   1    2    3 
25.2 26.0 21.2 
> pred.frame <- data.frame(machine=factor(c(1,2,3,1,2,3)),diameter=c(25,25,25,25.2,26,21.2))

If we predict using the model that ignores the covariate, then the covariate has no effect on the prediction (duh!), and the predictions are the means by each treatment. Treatments 1 and 2 are predicted 1.8 units apart.

> predict(mach.only,pred.frame)
   1    2    3    4    5    6 
41.4 43.2 36.0 41.4 43.2 36.0 
> tapply(strength,machine,mean)
   1    2    3 
41.4 43.2 36.0 

On the other hand, consider the model that includes the covariate. When you predict at the mean of each covariate, the predictions are just the treatment means. But if you predict at a common value of the covariate, the predictions differ by amounts in agreement with the coefficients in the covariate model.

> predict(cov.then.trt,pred.frame)
       1        2        3        4        5        6 
41.20920 42.24601 39.62515 41.40000 43.20000 36.00000 

You can see the effect in this plot. The black lines are the covariate model fits. The vertical distance between the black lines represents the treatment effect in the covariate model. These differences are adjusted back to the same covariate.

The red circles and horizontal lines are what you get if you predict at each treatment's average covariate: vertically, these are the unadjusted treatment means.

> plot(diameter,strength,pch=as.character(machine))
> cfs <- coefficients(cov.then.trt)
> abline(cfs[1]+cfs[3],cfs[2])
> abline(cfs[1]+cfs[4],cfs[2])
> abline(cfs[1]-cfs[3]-cfs[4],cfs[2])
> points(25.2,41.4,col="red")
> points(26,43.2,col="red")
> points(21.2,36,col="red")
> abline(h=36,col="red")
> abline(h=43.2,col="red")
> abline(h=41.4,col="red")

Figure 47.6: Strength data with model fits

47.6 What if treatments affect the covariates?

Ordinary ANCOVA does variance reduction and compares treatments at a common level of the covariate. What if the treatment changes the covariate? Then we might not want to compare at a common value of the covariate.

Some of the variability in the data can be explained by the treatment, some by the covariate, and some by either. Ordinary ANCOVA attributes the “either” portion to the covariate. We can change this so the “either” portion gets attributed to the treatment.

Adjusted covariates are the residuals from modeling the covariate on the treatment. If you use adjusted covariates, then the “either” portion of the variation goes to the treatment.

Begin by getting the adjusted covariates, then fit the model using the adjusted covariate.

> diam.on.mach <- lm(diameter ~ machine)
> diam.adj <- residuals(diam.on.mach)
> cov.adj.trt <- lm(strength~diam.adj+machine)

The SS explained by this model is the same as when using the original covariate, but the SS for treatment is what you would get if treatment came first in a sequential (Type I) ANOVA.

In these data, using adjusted covariates leads to a highly significant treatment effect.

> anova(cov.adj.trt)
Analysis of Variance Table

Response: strength
          Df  Sum Sq Mean Sq F value    Pr(>F)    
diam.adj   1 178.014 178.014  69.969 4.264e-06 ***
machine    2 140.400  70.200  27.593 5.170e-05 ***
Residuals 11  27.986   2.544                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(lm(strength~machine+diameter))
Analysis of Variance Table

Response: strength
          Df  Sum Sq Mean Sq F value    Pr(>F)    
machine    2 140.400  70.200  27.593 5.170e-05 ***
diameter   1 178.014 178.014  69.969 4.264e-06 ***
Residuals 11  27.986   2.544                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The treatment coefficients are different, but the slope is the same.

> coefficients(cov.adj.trt)
(Intercept)    diam.adj    machine1    machine2 
 40.2000000   0.9539877   1.2000000   3.0000000 
> coefficients(cov.then.trt)
(Intercept)    diameter    machine1    machine2 
 17.1770961   0.9539877   0.1824131   1.2192229 

Now we plot using the adjusted covariates. The differences between the lines are the same as the differences between the raw treatment means.

> plot(diam.adj,strength,pch=as.character(machine))
> cfs <- coefficients(cov.adj.trt)
> abline(cfs[1]+cfs[3],cfs[2])
> abline(cfs[1]+cfs[4],cfs[2])
> abline(cfs[1]-cfs[3]-cfs[4],cfs[2])
> points(0,41.4,col="red")
> points(0,43.2,col="red")
> points(0,36,col="red")
> abline(h=36,col="red")
> abline(h=43.2,col="red")
> abline(h=41.4,col="red")

Figure 47.7: Strength data with adjusted covariate model fits

Moral: If you use the original covariate, then ANCOVA gives you covariate adjusted treatment means.
If you use the adjusted covariate, then ANCOVA gives you the unadjusted treatment means.
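To make the first half of the moral concrete, one way to compute the covariate-adjusted treatment means is to predict every machine at a single common diameter; here is a sketch using the overall mean diameter and the cov.then.trt model from above (`pred.common` is just an illustrative name).

> pred.common <- data.frame(machine=factor(1:3), diameter=mean(diameter))
> predict(cov.then.trt, pred.common)

Any common diameter would give the same differences between machines; the overall mean is just a conventional choice.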

How do you tell if treatments affect covariates? Just look at the ANOVA for the model you used to get adjusted covariates.

Here there is no evidence that treatments affect covariates.

> anova(diam.on.mach)
Analysis of Variance Table

Response: diameter
          Df  Sum Sq Mean Sq F value Pr(>F)
machine    2  66.133  33.067  2.0286 0.1742
Residuals 12 195.600  16.300               

47.7 Other covariate models

The standard covariate model we have been looking at is the “parallel lines” model, also called the “separate intercepts” model. The treatments have an additive effect, but the slope on the covariate is the same in all treatments.

Less common covariate models include:

Separate slopes: In this model, there is a common intercept, but each treatment group has its own slope. This model is highly dependent on where 0 is on the covariate.

Separate lines: In this model, each treatment has its own slope and its own intercept.

You can compare single line, to parallel lines, to separate lines via ANOVA, because the models form a nested set.

You can compare single line, to separate slopes, to separate lines via ANOVA, because the models form a nested set.

Parallel lines and separate slopes cannot be compared using ANOVA (not nested), but you can compare via AIC.

Fit the models. Note that we also fit a separate slopes model where we shift the diameter by 25.

> single.line <- lm(strength~diameter)
> parallel.lines <- lm(strength~diameter+machine)
> separate.lines <- lm(strength~diameter*machine)
> separate.slopes <- lm(strength~1+diameter:machine)
> diameter.m25 <- lm(strength~1+I(diameter-25):machine)
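Because parallel lines and separate slopes are not nested, ANOVA does not apply to that pair, but as mentioned above we can compare them with AIC (smaller is better); a quick sketch using the models just fit:

> AIC(parallel.lines, separate.slopes)

Since the two models have nearly identical residual SS and the same number of parameters, we should expect their AICs to be very close here.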

The first comparison here is the normal analysis of covariance. The SS matches what we saw before, but the printed p-value is different because the F test uses the MSE from the separate lines model rather than from the parallel lines model.

> anova(single.line,parallel.lines,separate.lines)
Analysis of Variance Table

Model 1: strength ~ diameter
Model 2: strength ~ diameter + machine
Model 3: strength ~ diameter * machine
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     13 41.270                           
2     11 27.986  2   13.2839 2.3675 0.1492
3      9 25.249  2    2.7372 0.4878 0.6293

Here is the comparison down the separate slopes side of the chain.

> anova(single.line,separate.slopes,separate.lines)
Analysis of Variance Table

Model 1: strength ~ diameter
Model 2: strength ~ 1 + diameter:machine
Model 3: strength ~ diameter * machine
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     13 41.270                           
2     11 27.913  2   13.3569 2.3806 0.1480
3      9 25.249  2    2.6642 0.4748 0.6367

Finally, note that changing the zero on the covariate changes the fit of the separate slopes model. This is not true for any of the other models.

> anova(separate.slopes)
Analysis of Variance Table

Response: strength
                 Df Sum Sq Mean Sq F value    Pr(>F)    
diameter:machine  3 318.49 106.162  41.837 2.627e-06 ***
Residuals        11  27.91   2.538                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(diameter.m25)
Analysis of Variance Table

Response: strength
                         Df  Sum Sq Mean Sq F value    Pr(>F)    
I(diameter - 25):machine  3 306.503 102.168  28.168 1.844e-05 ***
Residuals                11  39.897   3.627                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1