Chapter 34 Factorial Topics
34.1 Hierarchy
Parameters are difficult to understand in non-hierarchical models, so I almost universally recommend hierarchical models. R helps you with that as well (assuming you think of it as help): when you put a term in a model before some of its subterms, R will have the term swallow up the subterms as well, enforcing hierarchy.
34.1.1 Barley sprouting
Let’s look at our old friend the barley sprouting data and try a couple of models.
- If you put the
weeks:water
interaction in the model afterwater
but beforeweeks
, then the SS and df forweeks:water
are the sums of those forweeks
andweeks:water
. - Note that the term that R fits is actually
water:weeks
instead ofweeks:water
. R names interactions with the elements in the order they first appeared in the model. - In the enforced hierarchy situation, not only do the SS and
df add up, but the effects for the interaction are the sum
of the effects for
weeks
andwater:weeks
.
> library(cfcdae)
> data(SproutingBarley)
> standard.model <- lm(sprouting~weeks*water,data=SproutingBarley)
> anova(standard.model)
Analysis of Variance Table
Response: sprouting
Df Sum Sq Mean Sq F value Pr(>F)
weeks 4 1321.13 330.28 5.5293 0.003645 **
water 1 1178.13 1178.13 19.7232 0.000251 ***
weeks:water 4 208.87 52.22 0.8742 0.496726
Residuals 20 1194.67 59.73
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> model.effects(standard.model,"weeks")
1 3 6 9 12
-7.5333333 -5.7000000 0.1333333 1.8000000 11.3000000
> model.effects(standard.model,"weeks:water")
4 8
1 -4.266667 4.266667
3 -1.433333 1.433333
6 0.400000 -0.400000
9 3.066667 -3.066667
12 2.233333 -2.233333
> skipped.model <- lm(sprouting~water+weeks:water,data=SproutingBarley)
> anova(skipped.model)
Analysis of Variance Table
Response: sprouting
Df Sum Sq Mean Sq F value Pr(>F)
water 1 1178.1 1178.13 19.7232 0.000251 ***
water:weeks 8 1530.0 191.25 3.2017 0.016522 *
Residuals 20 1194.7 59.73
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> model.effects(skipped.model,"water:weeks")
1 3 6 9 12
4 -11.800000 -7.133333 0.5333333 4.866667 13.533333
8 -3.266667 -4.266667 -0.2666667 -1.266667 9.066667
34.1.2 Amylase activity
Now we look at the amylase activity data. There are three factors: variety of corn, growth temperature, and analysis temperature. Read in the data; fit the preliminary model; check residuals. We see increasing variance.
> data(AmylaseActivity)
> amylase.fit <- lm(amylase ~ aTemp*gTemp*variety,data=AmylaseActivity)
> plot(amylase.fit,which=1)

Figure 34.1: Residual plot for amylase activity
Box-Cox suggests a log, but power 1 (leave the data alone) is surprisingly close to the interval.
> car::boxCox(amylase.fit)

Figure 34.2: Box-Cox plot for amylase activity
Fit the model with the transformed data and do the anova.
The highly significant terms are aTemp
, variety
, and
gTemp:variety
. Note that the main effect gTemp
is
not (even close to) significant, but gTemp
interacts
with variety
. We still keep gTemp
in the model.
With a weighting other than the standard equal weighting,
gTemp
might be significant.
(If you look at residuals for the logged data, constant variance is better, but not perfect, and the residuals are a bit short tailed.)
> amylase.fitlog <- lm(log(amylase)~aTemp*gTemp*variety,data=AmylaseActivity)
> anova(amylase.fitlog)
Analysis of Variance Table
Response: log(amylase)
Df Sum Sq Mean Sq F value Pr(>F)
aTemp 7 3.01613 0.43088 78.8628 < 2.2e-16 ***
gTemp 1 0.00438 0.00438 0.8016 0.3739757
variety 1 0.58957 0.58957 107.9085 2.305e-15 ***
aTemp:gTemp 7 0.08106 0.01158 2.1195 0.0539203 .
aTemp:variety 7 0.02758 0.00394 0.7212 0.6543993
gTemp:variety 1 0.08599 0.08599 15.7392 0.0001863 ***
aTemp:gTemp:variety 7 0.04764 0.00681 1.2457 0.2916176
Residuals 64 0.34967 0.00546
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
So what is going on? Look at the interaction plot. For variety B73, the response increases in growth temperature. For variety Oh43, the response decreases in growth temperature, by an almost equal amount. So simply averaging over the two, it does not appear that growth temperature has a main effect.
If variety B73 were five times as important (perhaps it is five times as prevalant in Minnesota), then you might want to weight B73 five times Oh43. With that weighting, the response would increase with growth temperature.
> interactplot(gTemp,variety,log(amylase),data=AmylaseActivity,
+ conf=.95,sigma2=.00546,df=64)

Figure 34.3: Interaction plot gTemp:variety for amylase activity
The growth temperature by analysis temperature interaction is marginally significant. If we look at it we find the activity rises more slowly with analysis temperature but drops more quickly when the growth temperature is low.
> interactplot(aTemp,gTemp,log(amylase),data=AmylaseActivity,
+ sigma2=.00546,df=64,conf=.95,jitter=.2)

Figure 34.4: Interactin plot for aTemp:gTemp for amylase data
34.1.3 What happens with unequal weights?
This next bit is heavy going; it is OK to just remember
- Parameterizations are abritrary.
- Parameters for terms that are included in other terms in the model can be affected by weightings of other factors.
- Stick with hierarchical models.
and just skip to the next section.
Still here? OK, now we’ll look at the data in Table 8.12.
We can see that the estimated effects for b
are zero,
but these are zero in the presence of the interaction.
(n=1 so
no df for error so no SE or confidence intervals.)
> library(cfcdae)
> y <- c(120,144,192,168,168,120)
> a <- factor(c(1,2,3,1,2,3))
> b <- factor(c(1,1,1,2,2,2))
> fit.standard <- lm(y~a*b)
> summary(fit.standard)
Call:
lm(formula = y ~ a * b)
Residuals:
ALL 6 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.52e+02 NaN NaN NaN
a1 -8.00e+00 NaN NaN NaN
a2 4.00e+00 NaN NaN NaN
b1 -3.78e-15 NaN NaN NaN
a1:b1 -2.40e+01 NaN NaN NaN
a2:b1 -1.20e+01 NaN NaN NaN
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 5 and 0 DF, p-value: NA
Don’t try this at home, kids, but now we’ll force R to weight the rows (1,2,1) instead of the usual (1,1,1). Ie, for row effects, the first, plus twice the second, plus the third add to zero.
The first thing we have to do is construct what R calls a “contrast” function.
> wts <- c(1,2,1)
> contr.weights.121 <-
+ function (n, contrasts = TRUE, sparse = FALSE)
+ {
+ if (length(n) <= 1L) {
+ if (is.numeric(n) && length(n) == 1L && n > 1L)
+ levels <- seq_len(n)
+ else stop("not enough degrees of freedom to define contrasts")
+ }
+ else levels <- n
+ levels <- as.character(levels)
+ nlevels <- length(levels)
+ if(!is.numeric(wts) || length(wts) != nlevels || min(wts) <= 0)
+ stop("wts must be vector of positive weights for each level of factor")
+
+ cont <- diag(rep(1,nlevels))
+
+ if (contrasts) {
+ cont <- cont[, -nlevels, drop = FALSE]
+ colnames(cont) <- NULL
+ }
+ cont[length(levels),] <- -wts[1:(nlevels-1)]/wts[nlevels]
+ cont
+ }
Honestly, it took me some time to figure out how to do
that; in the end it’s a modification of the contr.sum()
contrasts
that we normally use.
Now refit, forcing the rows to have those weights. Here, factor B has non-zero coefficients.
> fit.wts <- lm(y~a*b,contrasts=c(a=contr.weights.121,b=contr.sum))
> summary(fit.wts)
Call:
lm(formula = y ~ a * b, contrasts = c(a = contr.weights.121,
b = contr.sum))
Residuals:
ALL 6 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 153 NaN NaN NaN
a1 -9 NaN NaN NaN
a2 3 NaN NaN NaN
b1 -3 NaN NaN NaN
a1:b1 -21 NaN NaN NaN
a2:b1 -9 NaN NaN NaN
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 5 and 0 DF, p-value: NA
But both models have the same fitted values.
> library(emmeans)
> emmeans(fit.standard,~a:b)
Warning in qt((1 - level)/adiv, df): NaNs produced
a b emmean SE df lower.CL upper.CL
1 1 120 NaN 0 NaN NaN
2 1 144 NaN 0 NaN NaN
3 1 192 NaN 0 NaN NaN
1 2 168 NaN 0 NaN NaN
2 2 168 NaN 0 NaN NaN
3 2 120 NaN 0 NaN NaN
Confidence level used: 0.95
> emmeans(fit.wts,~a:b)
Warning in qt((1 - level)/adiv, df): NaNs produced
a b emmean SE df lower.CL upper.CL
1 1 120 NaN 0 NaN NaN
2 1 144 NaN 0 NaN NaN
3 1 192 NaN 0 NaN NaN
1 2 168 NaN 0 NaN NaN
2 2 168 NaN 0 NaN NaN
3 2 120 NaN 0 NaN NaN
Confidence level used: 0.95
If we don’t include the interaction term,
then the coefficients for b
are the same regardless
of how a
is weighted.
> summary(lm(y~a+b))
Call:
lm(formula = y ~ a + b)
Residuals:
1 2 3 4 5 6
-24 -12 36 24 12 -36
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.520e+02 1.833e+01 8.292 0.0142 *
a1 -8.000e+00 2.592e+01 -0.309 0.7868
a2 4.000e+00 2.592e+01 0.154 0.8915
b1 -6.020e-15 1.833e+01 0.000 1.0000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 44.9 on 2 degrees of freedom
Multiple R-squared: 0.04545, Adjusted R-squared: -1.386
F-statistic: 0.03175 on 3 and 2 DF, p-value: 0.9903
> summary(lm(y~a+b,contrasts = c(a = contr.weights.121,
+ b = contr.sum)))
Call:
lm(formula = y ~ a + b, contrasts = c(a = contr.weights.121,
b = contr.sum))
Residuals:
1 2 3 4 5 6
-24 -12 36 24 12 -36
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.530e+02 1.944e+01 7.869 0.0158 *
a1 -9.000e+00 2.970e+01 -0.303 0.7905
a2 3.000e+00 1.944e+01 0.154 0.8915
b1 -6.020e-15 1.833e+01 0.000 1.0000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 44.9 on 2 degrees of freedom
Multiple R-squared: 0.04545, Adjusted R-squared: -1.386
F-statistic: 0.03175 on 3 and 2 DF, p-value: 0.9903
34.2 Pooling
Pooling is the practice of dropping non-significant terms from a model. Their SS and df thus appear in (get added to) error. They are said to be pooled into error.
I only recommend doing this if
- You have very few degrees of freedom for error and the F ratio for the potentially pooled term is less than 2, or
- The data are unbalanced (or more generally, your model terms are not all orthogonal as might happen with polynomial models) and you want to look at coefficients.
34.2.1 Carbon wire
Let’s look at the carbon wire data again.
> wire <- read.table("carbonwire.txt",header=TRUE)
> names(wire)
[1] "resist" "degas" "temp" "diff"
> wire <- within(wire, {degas <- factor(degas); temp <- factor(temp);
+ diff <- factor(diff)})
> wire.fit <- lm(resist~degas*temp*diff,data=wire)
> anova(wire.fit)
Analysis of Variance Table
Response: resist
Df Sum Sq Mean Sq F value Pr(>F)
degas 1 0.47 0.467 1.5918 0.2125
temp 2 410.69 205.346 699.6024 < 2.2e-16 ***
diff 2 80.54 40.270 137.1989 < 2.2e-16 ***
degas:temp 2 0.31 0.157 0.5342 0.5892
degas:diff 2 0.70 0.351 1.1957 0.3104
temp:diff 4 14.81 3.704 12.6177 2.566e-07 ***
degas:temp:diff 4 0.87 0.219 0.7450 0.5656
Residuals 54 15.85 0.294
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No term involving degas
was significant. Some might
argue to remove those terms and refit as below.
I would say not to do that. The data are balanced,
so removing terms will not change coefficient
estimates, and we already have 54 df for error.
We don’t need to risk biasing our MSE.
> wire.fit.pooled <- lm(resist~temp*diff,data=wire)
> anova(wire.fit.pooled)
Analysis of Variance Table
Response: resist
Df Sum Sq Mean Sq F value Pr(>F)
temp 2 410.69 205.346 710.521 < 2.2e-16 ***
diff 2 80.54 40.270 139.340 < 2.2e-16 ***
temp:diff 4 14.81 3.704 12.815 1.085e-07 ***
Residuals 63 18.21 0.289
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
34.3 Single replications
Single replication experiments mean that we have no df for error, so we need to assemble some kind of surrogate estimate of error.
The most common approach is to drop one (or more) high-order interaction effects out of the model, thus pooling them into error. This is plausible based on the observation that interactions are typically smaller than main effects, and higher order interactions are typically the smallest.
Note that “typically smaller” does not mean always smaller. Whenever we construct a surrogate estimate of error, we run the risk of biasing our estimate of error.
34.3.1 Zinc retention
Let’s go back to the zinc retention data. That three-factor experiment had just one replication, so we could decide to use the three-factor interaction as error.
> data(ZincRetention)
> zinc.3out <- lm(retention~(m.protein+d.zinc+m.zinc)^2,data=ZincRetention)
> anova(zinc.3out)
Analysis of Variance Table
Response: retention
Df Sum Sq Mean Sq F value Pr(>F)
m.protein 3 827.25 275.75 27.1230 0.011261 *
d.zinc 1 756.25 756.25 74.3852 0.003278 **
m.zinc 1 1225.00 1225.00 120.4918 0.001619 **
m.protein:d.zinc 3 28.25 9.42 0.9262 0.524375
m.protein:m.zinc 3 298.50 99.50 9.7869 0.046563 *
d.zinc:m.zinc 1 4.00 4.00 0.3934 0.574988
Residuals 3 30.50 10.17
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the ANOVA we see that there are only 3 df for error (
this is the three factor interaction). That is not many,
so we look to see if we can pool. Both m.protein:d.zinc
and d.zinc:m.zinc
have F statistics less than 2 (and
are non-significant), so we refit without them. We are
now up to 7 df for error, which is better.
> zinc.reduced <- lm(retention~m.protein*m.zinc+d.zinc,data=ZincRetention)
> anova(zinc.reduced)
Analysis of Variance Table
Response: retention
Df Sum Sq Mean Sq F value Pr(>F)
m.protein 3 827.25 275.75 30.761 0.0002106 ***
m.zinc 1 1225.00 1225.00 136.653 7.578e-06 ***
d.zinc 1 756.25 756.25 84.362 3.736e-05 ***
m.protein:m.zinc 3 298.50 99.50 11.100 0.0047289 **
Residuals 7 62.75 8.96
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What the pooling really was saying is that the two additional two-factor interactions we dropped are of roughly the same size (from an MS perspective) as the three-factor interaction. If the three-factor interaction was ignorable, then great. If the three-factor interaction was actually something beyond random noise, then all that as happened is that we are now more certain (more df) of a biased estimate of error.
Can we check this? In the full factorial model, the expected values of the residuals are zero, because we fit all the treatment means. In the reduced model, the expected values of the residuals follow the interaction(s) that were pooled into error.
Stipulating that residual plots with few df for error are difficult to interpret, we can look at our residuals. If they are not always centered around zero (ie, if there is a pattern in the mean), then our surrogate estimate of error is biased upwards. If it is biased upwards, then we are doing conservative tests; the reported p-values are likely larger than they should be.
Let’s plot the residuals from the two-way fit. That looks like a pattern to me, so the tests are probably conservative.
> plot(zinc.3out,which=1)

Figure 34.5: Residual plot for two-way model of zinc retention
34.3.2 Page faults
The page fault data are a single replication of the number of page faults a computer has as a function of four factors: amount of RAM, size of the program, initialization sequence, and paging algorithm.
We note that the response faults
is a non-negative
variable that ranges over several orders of magnitude.
The first quartile is 4.25 times the minimum;
the median is 5.75 times the first quartile;
the third quartile is 4.25 times the median; and
the maximum is 5.13 times third quartile.
These data are practically begging for a log
transformation.
> data(PageFaults)
> head(PageFaults)
alg init size ram faults
1 1 1 large large 32
2 1 1 large medium 48
3 1 1 large small 538
4 1 1 medium large 53
5 1 1 medium medium 81
6 1 1 medium small 1901
> summary(PageFaults$faults)
Min. 1st Qu. Median Mean 3rd Qu. Max.
32.0 136.0 782.5 2521.2 3333.5 17117.0
To get started, we need to pool some terms into error. One reasonable place to start is to remove only the four-factor interaction. That is only going to be 8 df for error, so another possibility would be to pool all three and four-factor interactions. The latter has the advantage of more df for error and the risk of pooling an active term into error.
> faults.out4 <- lm(faults~(alg+init+size+ram)^3,data=PageFaults)
> faults.out34 <- lm(faults~(alg+init+size+ram)^2,data=PageFaults)
You can’t really see much in the residuals plot of faults.out4
.
> plot(faults.out4,which=1)

Figure 34.6: Residuals plot of three-way model of page faults
However, Box-Cox is really convinced of the need to transform, with the optimum somewhere around power .1. Log is right at the edge of the interval, and 1 is so far away it’s funny. We really need to transform.
> car::boxCox(faults.out4,lambda=seq(-.5,.5,.01))

Figure 34.7: Box-Cox plot of three-way model of page faults
Now let’s look at the two-way model. Wow! When you do not include all terms in your model, some non-ignorable terms can drop down into error giving you systematic patterns in the residuals. I call this pattern the flopping fish.
This can often be fixed via transformation, but you sometimes need additional predictive terms.
> plot(faults.out34,which=1)

Figure 34.8: Residuals plot of two-way model of page faults
Box-Cox says that log is the right thing to do.
> car::boxCox(faults.out34)

Figure 34.9: Box-Cox plot of two-way model of page faults
OK, so log transformation it is. Let’s refit our two potential models with logged data.
> logfaults.out4 <- lm(log(faults)~(alg+init+size+ram)^3,data=PageFaults)
> logfaults.out34 <- lm(log(faults)~(alg+init+size+ram)^2,data=PageFaults)
Now we’ll check residuals for the transformed data. Looks OK when we just drop the four way.
> plot(logfaults.out4,which=1)

Figure 34.10: Residual plot of three way logged paged faults
But the plot of the two-way model has a problem: there is a substantial pattern with the local average of the residuals moving up and down (the variability around that pattern is fairly constant, but that pattern …). This means that one or more of the three-way terms is large and inflating the MSE.
> plot(logfaults.out34,which=1)

Figure 34.11: Residual plot of two way logged paged faults
OK, we can now start looking at ANOVA. We used the three-way model with logged data.
There are only 8 df for error, which isn’t very many. There are two non-significant terms with F statistics less than 2 that we can pool into error and still maintain hierarchy.
> anova(logfaults.out4)
Analysis of Variance Table
Response: log(faults)
Df Sum Sq Mean Sq F value Pr(>F)
alg 1 2.502 2.502 877.9857 1.824e-09 ***
init 2 24.639 12.320 4323.4054 7.300e-13 ***
size 2 41.692 20.846 7315.5596 8.919e-14 ***
ram 2 92.697 46.349 16265.4282 3.654e-15 ***
alg:init 2 0.018 0.009 3.0947 0.101042
alg:size 2 0.022 0.011 3.8979 0.065794 .
alg:ram 2 0.060 0.030 10.5350 0.005736 **
init:size 4 0.829 0.207 72.7278 2.511e-06 ***
init:ram 4 9.510 2.378 834.3927 1.632e-10 ***
size:ram 4 0.504 0.126 44.2447 1.689e-05 ***
alg:init:size 4 0.015 0.004 1.2778 0.354761
alg:init:ram 4 0.026 0.007 2.2818 0.149073
alg:size:ram 4 0.004 0.001 0.3511 0.836454
init:size:ram 8 1.052 0.132 46.1535 6.726e-06 ***
Residuals 8 0.023 0.003
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Let’s take those two terms out. Note the use of update()
to
change a model without needing to type everything in again.
This final model has some highly significant terms and a handful of marginally signficant terms. However, they have F ratios larger than 2 and we have 16 df for error, so we stay with this model.
> logfaults.final <- update(logfaults.out4,~.-alg:size:ram-alg:init:size)
> anova(logfaults.final)
Analysis of Variance Table
Response: log(faults)
Df Sum Sq Mean Sq F value Pr(>F)
alg 1 2.502 2.502 967.7896 9.683e-16 ***
init 2 24.639 12.320 4765.6206 < 2.2e-16 ***
size 2 41.692 20.846 8063.8244 < 2.2e-16 ***
ram 2 92.697 46.349 17929.1216 < 2.2e-16 ***
alg:init 2 0.018 0.009 3.4113 0.0583516 .
alg:size 2 0.022 0.011 4.2966 0.0320944 *
alg:ram 2 0.060 0.030 11.6126 0.0007664 ***
init:size 4 0.829 0.207 80.1667 2.243e-10 ***
init:ram 4 9.510 2.378 919.7377 < 2.2e-16 ***
size:ram 4 0.504 0.126 48.7702 9.148e-09 ***
alg:init:ram 4 0.026 0.007 2.5151 0.0825350 .
init:size:ram 8 1.052 0.132 50.8743 6.241e-10 ***
Residuals 16 0.041 0.003
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We were primarily interested in the effects of algorithm.
Most of the interactions involving algorithm are
marginally significant, but alg:ram
has a p-value
less than .001. Use emmeans()
to see what is happening.
Algorithm 1 produces fewer mean faults than algorithm 2 at all levels of RAM, but it’s advantage for small RAM (.34) is less than its advantage at medium or large (.48).
> margin <- emmeans::emmeans(logfaults.final,~alg:ram)
NOTE: Results may be misleading due to involvement in interactions
> margin
alg ram emmean SE df lower.CL upper.CL
1 large 4.710 0.01695 16 4.674 4.746
2 large 5.190 0.01695 16 5.154 5.226
1 medium 6.227 0.01695 16 6.191 6.263
2 medium 6.702 0.01695 16 6.666 6.738
1 small 7.990 0.01695 16 7.954 8.026
2 small 8.326 0.01695 16 8.290 8.362
Results are averaged over the levels of: init, size
Results are given on the log (not the response) scale.
Confidence level used: 0.95
> matrix(predict(margin),ncol=2,byrow=TRUE)
[,1] [,2]
[1,] 4.710091 5.190324
[2,] 6.227057 6.702067
[3,] 7.989746 8.325972
> linear.contrast(logfaults.final,alg,c(-1,1),byvar=ram)
estimates se t-value p-value lower-ci upper-ci
1 0.4802324 0.02396805 20.03635 9.304802e-13 0.4294224 0.5310424
2 0.4750106 0.02396805 19.81849 1.101281e-12 0.4242006 0.5258206
3 0.3362258 0.02396805 14.02808 2.077443e-10 0.2854158 0.3870358
attr(,"class")
[1] "linear.contrast"
Finally, we return the our old adage that interaction depends on scale. Look at the two side by side plots below. For the raw data, many interactions are the size of main effects. On the log scale, many interactions are small relative to main effects. The transformation has not only improved the behavior of the residuals, it has simplified the model.
> sidebyside(faults.out4)

Figure 34.12: Side by side plot of page faults on original scale
> sidebyside(logfaults.out4)

Figure 34.13: Side by side plot fo page faults on log scale