Chapter 43 Confounding
43.1 conf.design
It might not surprise you to learn that
the package conf.design
contains tools for working
with designs with confounding.
> library(cfcdae)
> library(conf.design)
Attaching package: 'conf.design'
The following object is masked from 'package:lme4':
factorize
> library(car)
43.1.1 Generating designs
conf.design
uses matrices to denote generators
for designs. Each row of a matrix corresponds
to a generator, and the columns correspond to all
the factors in the design, regardless of how
many are in the generator. A 1 indicates that
the factor is in the generator, and a 0 indicates
it is not.
Here is a matrix for a \(2^4\) design with ABD as generator.
> ABD.4 <- t(c(1,1,0,1))
> ABD.4
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
Here is a matrix for a \(2^4\) design with ABD and CD as generators. We show a variety of ways of creating the same matrix. I think the first is easiest to think about, but to each his own.
> ABD.CD.4 <- rbind(c(1,1,0,1),c(0,0,1,1))
> ABD.CD.4
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
[2,] 0 0 1 1
> ABD.CD.4 <- matrix(c(1,0,1,0,0,1,1,1),nrow=2)
> ABD.CD.4
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
[2,] 0 0 1 1
> ABD.CD.4 <- matrix(c(1,1,0,1,0,0,1,1),nrow=2,byrow=TRUE)
> ABD.CD.4
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
[2,] 0 0 1 1
This is for ABD and ABCD; we know that this is a bad idea.
> ABD.ABCD.4 <- rbind(c(1,1,0,1),c(1,1,1,1))
> ABD.ABCD.4
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
[2,] 1 1 1 1
And lets make a big one. Use ABCE, ABDF, ACDG, and BCDH as generators in a \(2^8\). Note that we can name the variables, which could save us some bother later on.
> big8 <- rbind(c(A=1,B=1,C=1,D=0,E=1,F=0,G=0,H=0),c(1,1,0,1,0,1,0,0),c(1,0,1,1,0,0,1,0), c(0,1,1,1,0,0,0,1))
> big8
A B C D E F G H
[1,] 1 1 1 0 1 0 0 0
[2,] 1 1 0 1 0 1 0 0
[3,] 1 0 1 1 0 0 1 0
[4,] 0 1 1 1 0 0 0 1
When setting up any confounded design, we need to know
what we are confounding. conf.set()
takes a matrix
of generators and returns all of the confounded effects.
The 2 as a second argument says that we are in a two-series
design.
OK, it’s pretty straitforward to figure out what is confounded when we only have one generator.
> conf.set(ABD.4,2)
[1] 1 1 0 1
And it’s really pretty easy when there are only two generators, because there is only one generalized interaction.
> conf.set(ABD.CD.4,2)
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
[2,] 0 0 1 1
[3,] 1 1 1 0
And this shows that if you use ABD and ABCD you also confound the main effect of C. Bad idea.
> conf.set(ABD.ABCD.4,2)
[,1] [,2] [,3] [,4]
[1,] 1 1 0 1
[2,] 1 1 1 1
[3,] 0 0 1 0
When we have four generators creating 16 blocks in a \(2^8\),
then conf.set()
really helps us.
> conf.set(big8,2)
A B C D E F G H
[1,] 1 1 1 0 1 0 0 0
[2,] 1 1 0 1 0 1 0 0
[3,] 0 0 1 1 1 1 0 0
[4,] 1 0 1 1 0 0 1 0
[5,] 0 1 0 1 1 0 1 0
[6,] 0 1 1 0 0 1 1 0
[7,] 1 0 0 0 1 1 1 0
[8,] 0 1 1 1 0 0 0 1
[9,] 1 0 0 1 1 0 0 1
[10,] 1 0 1 0 0 1 0 1
[11,] 0 1 0 0 1 1 0 1
[12,] 1 1 0 0 0 0 1 1
[13,] 0 0 1 0 1 0 1 1
[14,] 0 0 0 1 0 1 1 1
[15,] 1 1 1 1 1 1 1 1
> letter.effect <- function(binary) {
+ Letters <- LETTERS[1:length(binary)]
+ paste(Letters[binary>0],collapse=":")
+ }
> apply(conf.set(big8,2),1,letter.effect)
[1] "A:B:C:E" "A:B:D:F" "C:D:E:F" "A:C:D:G" "B:D:E:G"
[6] "B:C:F:G" "A:E:F:G" "B:C:D:H" "A:D:E:H" "A:C:F:H"
[11] "B:E:F:H" "A:B:G:H" "C:E:G:H" "D:F:G:H" "A:B:C:D:E:F:G:H"
In addition to finding the generalized interactions,
we can use conf.design()
to determine the actual design.
With just two blocks, blocks are indicated by a single 0/1. Levels of each treatment are also indicated by 0/1. Thus block 0 below includes (1), ab, c, abc, ad, bd, acd, and bcd.
> conf.design(ABD.4,2)
Blocks T1 T2 T3 T4
1 0 0 0 0 0
2 0 1 1 0 0
3 0 0 0 1 0
4 0 1 1 1 0
5 0 1 0 0 1
6 0 0 1 0 1
7 0 1 0 1 1
8 0 0 1 1 1
9 1 1 0 0 0
10 1 0 1 0 0
11 1 1 0 1 0
12 1 0 1 1 0
13 1 0 0 0 1
14 1 1 1 0 1
15 1 0 0 1 1
16 1 1 1 1 1
With four blocks, blocks are indicated by pairs 00, 01, 10, or 11 corresponding to the two generators. Below, the 01 block includes c, abc, ad, bd.
> conf.design(ABD.CD.4,2)
Blocks T1 T2 T3 T4
1 00 0 0 0 0
2 00 1 1 0 0
3 00 1 0 1 1
4 00 0 1 1 1
5 01 0 0 1 0
6 01 1 1 1 0
7 01 1 0 0 1
8 01 0 1 0 1
9 10 1 0 0 0
10 10 0 1 0 0
11 10 0 0 1 1
12 10 1 1 1 1
13 11 1 0 1 0
14 11 0 1 1 0
15 11 0 0 0 1
16 11 1 1 0 1
And now the big one. At least with my defaults, R won’t even print out the entire design. Sixteen blocks indicated by 0000, 0001, 0010, 0011, … 1111.
> conf.design(big8,2)
Blocks A B C D E F G H
1 0000 0 0 0 0 0 0 0 0
2 0000 1 1 1 0 1 0 0 0
3 0000 1 1 0 1 0 1 0 0
4 0000 0 0 1 1 1 1 0 0
5 0000 1 0 1 1 0 0 1 0
6 0000 0 1 0 1 1 0 1 0
7 0000 0 1 1 0 0 1 1 0
8 0000 1 0 0 0 1 1 1 0
9 0000 0 1 1 1 0 0 0 1
10 0000 1 0 0 1 1 0 0 1
11 0000 1 0 1 0 0 1 0 1
12 0000 0 1 0 0 1 1 0 1
13 0000 1 1 0 0 0 0 1 1
14 0000 0 0 1 0 1 0 1 1
15 0000 0 0 0 1 0 1 1 1
16 0000 1 1 1 1 1 1 1 1
17 0001 0 1 1 1 0 0 0 0
18 0001 1 0 0 1 1 0 0 0
19 0001 1 0 1 0 0 1 0 0
20 0001 0 1 0 0 1 1 0 0
21 0001 1 1 0 0 0 0 1 0
22 0001 0 0 1 0 1 0 1 0
23 0001 0 0 0 1 0 1 1 0
24 0001 1 1 1 1 1 1 1 0
25 0001 0 0 0 0 0 0 0 1
26 0001 1 1 1 0 1 0 0 1
27 0001 1 1 0 1 0 1 0 1
28 0001 0 0 1 1 1 1 0 1
29 0001 1 0 1 1 0 0 1 1
30 0001 0 1 0 1 1 0 1 1
31 0001 0 1 1 0 0 1 1 1
32 0001 1 0 0 0 1 1 1 1
33 0010 1 0 1 1 0 0 0 0
34 0010 0 1 0 1 1 0 0 0
35 0010 0 1 1 0 0 1 0 0
36 0010 1 0 0 0 1 1 0 0
37 0010 0 0 0 0 0 0 1 0
38 0010 1 1 1 0 1 0 1 0
39 0010 1 1 0 1 0 1 1 0
40 0010 0 0 1 1 1 1 1 0
41 0010 1 1 0 0 0 0 0 1
42 0010 0 0 1 0 1 0 0 1
43 0010 0 0 0 1 0 1 0 1
44 0010 1 1 1 1 1 1 0 1
45 0010 0 1 1 1 0 0 1 1
46 0010 1 0 0 1 1 0 1 1
47 0010 1 0 1 0 0 1 1 1
48 0010 0 1 0 0 1 1 1 1
49 0011 1 1 0 0 0 0 0 0
50 0011 0 0 1 0 1 0 0 0
51 0011 0 0 0 1 0 1 0 0
52 0011 1 1 1 1 1 1 0 0
53 0011 0 1 1 1 0 0 1 0
54 0011 1 0 0 1 1 0 1 0
55 0011 1 0 1 0 0 1 1 0
56 0011 0 1 0 0 1 1 1 0
57 0011 1 0 1 1 0 0 0 1
58 0011 0 1 0 1 1 0 0 1
59 0011 0 1 1 0 0 1 0 1
60 0011 1 0 0 0 1 1 0 1
61 0011 0 0 0 0 0 0 1 1
62 0011 1 1 1 0 1 0 1 1
63 0011 1 1 0 1 0 1 1 1
64 0011 0 0 1 1 1 1 1 1
65 0100 1 1 0 1 0 0 0 0
66 0100 0 0 1 1 1 0 0 0
67 0100 0 0 0 0 0 1 0 0
68 0100 1 1 1 0 1 1 0 0
69 0100 0 1 1 0 0 0 1 0
70 0100 1 0 0 0 1 0 1 0
71 0100 1 0 1 1 0 1 1 0
72 0100 0 1 0 1 1 1 1 0
73 0100 1 0 1 0 0 0 0 1
74 0100 0 1 0 0 1 0 0 1
75 0100 0 1 1 1 0 1 0 1
76 0100 1 0 0 1 1 1 0 1
77 0100 0 0 0 1 0 0 1 1
78 0100 1 1 1 1 1 0 1 1
79 0100 1 1 0 0 0 1 1 1
80 0100 0 0 1 0 1 1 1 1
81 0101 1 0 1 0 0 0 0 0
82 0101 0 1 0 0 1 0 0 0
83 0101 0 1 1 1 0 1 0 0
84 0101 1 0 0 1 1 1 0 0
85 0101 0 0 0 1 0 0 1 0
86 0101 1 1 1 1 1 0 1 0
87 0101 1 1 0 0 0 1 1 0
88 0101 0 0 1 0 1 1 1 0
89 0101 1 1 0 1 0 0 0 1
90 0101 0 0 1 1 1 0 0 1
91 0101 0 0 0 0 0 1 0 1
92 0101 1 1 1 0 1 1 0 1
93 0101 0 1 1 0 0 0 1 1
94 0101 1 0 0 0 1 0 1 1
95 0101 1 0 1 1 0 1 1 1
96 0101 0 1 0 1 1 1 1 1
97 0110 0 1 1 0 0 0 0 0
98 0110 1 0 0 0 1 0 0 0
99 0110 1 0 1 1 0 1 0 0
100 0110 0 1 0 1 1 1 0 0
101 0110 1 1 0 1 0 0 1 0
102 0110 0 0 1 1 1 0 1 0
103 0110 0 0 0 0 0 1 1 0
104 0110 1 1 1 0 1 1 1 0
105 0110 0 0 0 1 0 0 0 1
106 0110 1 1 1 1 1 0 0 1
107 0110 1 1 0 0 0 1 0 1
108 0110 0 0 1 0 1 1 0 1
109 0110 1 0 1 0 0 0 1 1
110 0110 0 1 0 0 1 0 1 1
111 0110 0 1 1 1 0 1 1 1
112 0110 1 0 0 1 1 1 1 1
113 0111 0 0 0 1 0 0 0 0
114 0111 1 1 1 1 1 0 0 0
115 0111 1 1 0 0 0 1 0 0
116 0111 0 0 1 0 1 1 0 0
117 0111 1 0 1 0 0 0 1 0
118 0111 0 1 0 0 1 0 1 0
119 0111 0 1 1 1 0 1 1 0
120 0111 1 0 0 1 1 1 1 0
121 0111 0 1 1 0 0 0 0 1
122 0111 1 0 0 0 1 0 0 1
123 0111 1 0 1 1 0 1 0 1
124 0111 0 1 0 1 1 1 0 1
125 0111 1 1 0 1 0 0 1 1
126 0111 0 0 1 1 1 0 1 1
127 0111 0 0 0 0 0 1 1 1
128 0111 1 1 1 0 1 1 1 1
129 1000 1 1 1 0 0 0 0 0
130 1000 0 0 0 0 1 0 0 0
131 1000 0 0 1 1 0 1 0 0
132 1000 1 1 0 1 1 1 0 0
133 1000 0 1 0 1 0 0 1 0
134 1000 1 0 1 1 1 0 1 0
135 1000 1 0 0 0 0 1 1 0
136 1000 0 1 1 0 1 1 1 0
137 1000 1 0 0 1 0 0 0 1
138 1000 0 1 1 1 1 0 0 1
139 1000 0 1 0 0 0 1 0 1
140 1000 1 0 1 0 1 1 0 1
141 1000 0 0 1 0 0 0 1 1
142 1000 1 1 0 0 1 0 1 1
143 1000 1 1 1 1 0 1 1 1
144 1000 0 0 0 1 1 1 1 1
145 1001 1 0 0 1 0 0 0 0
146 1001 0 1 1 1 1 0 0 0
147 1001 0 1 0 0 0 1 0 0
148 1001 1 0 1 0 1 1 0 0
149 1001 0 0 1 0 0 0 1 0
150 1001 1 1 0 0 1 0 1 0
151 1001 1 1 1 1 0 1 1 0
152 1001 0 0 0 1 1 1 1 0
153 1001 1 1 1 0 0 0 0 1
154 1001 0 0 0 0 1 0 0 1
155 1001 0 0 1 1 0 1 0 1
156 1001 1 1 0 1 1 1 0 1
157 1001 0 1 0 1 0 0 1 1
158 1001 1 0 1 1 1 0 1 1
159 1001 1 0 0 0 0 1 1 1
160 1001 0 1 1 0 1 1 1 1
161 1010 0 1 0 1 0 0 0 0
162 1010 1 0 1 1 1 0 0 0
163 1010 1 0 0 0 0 1 0 0
164 1010 0 1 1 0 1 1 0 0
165 1010 1 1 1 0 0 0 1 0
166 1010 0 0 0 0 1 0 1 0
167 1010 0 0 1 1 0 1 1 0
168 1010 1 1 0 1 1 1 1 0
169 1010 0 0 1 0 0 0 0 1
170 1010 1 1 0 0 1 0 0 1
171 1010 1 1 1 1 0 1 0 1
172 1010 0 0 0 1 1 1 0 1
173 1010 1 0 0 1 0 0 1 1
174 1010 0 1 1 1 1 0 1 1
175 1010 0 1 0 0 0 1 1 1
176 1010 1 0 1 0 1 1 1 1
177 1011 0 0 1 0 0 0 0 0
178 1011 1 1 0 0 1 0 0 0
179 1011 1 1 1 1 0 1 0 0
180 1011 0 0 0 1 1 1 0 0
181 1011 1 0 0 1 0 0 1 0
182 1011 0 1 1 1 1 0 1 0
183 1011 0 1 0 0 0 1 1 0
184 1011 1 0 1 0 1 1 1 0
185 1011 0 1 0 1 0 0 0 1
186 1011 1 0 1 1 1 0 0 1
187 1011 1 0 0 0 0 1 0 1
188 1011 0 1 1 0 1 1 0 1
189 1011 1 1 1 0 0 0 1 1
190 1011 0 0 0 0 1 0 1 1
191 1011 0 0 1 1 0 1 1 1
192 1011 1 1 0 1 1 1 1 1
193 1100 0 0 1 1 0 0 0 0
194 1100 1 1 0 1 1 0 0 0
195 1100 1 1 1 0 0 1 0 0
196 1100 0 0 0 0 1 1 0 0
197 1100 1 0 0 0 0 0 1 0
198 1100 0 1 1 0 1 0 1 0
199 1100 0 1 0 1 0 1 1 0
200 1100 1 0 1 1 1 1 1 0
201 1100 0 1 0 0 0 0 0 1
202 1100 1 0 1 0 1 0 0 1
203 1100 1 0 0 1 0 1 0 1
204 1100 0 1 1 1 1 1 0 1
205 1100 1 1 1 1 0 0 1 1
206 1100 0 0 0 1 1 0 1 1
207 1100 0 0 1 0 0 1 1 1
208 1100 1 1 0 0 1 1 1 1
209 1101 0 1 0 0 0 0 0 0
210 1101 1 0 1 0 1 0 0 0
211 1101 1 0 0 1 0 1 0 0
212 1101 0 1 1 1 1 1 0 0
213 1101 1 1 1 1 0 0 1 0
214 1101 0 0 0 1 1 0 1 0
215 1101 0 0 1 0 0 1 1 0
216 1101 1 1 0 0 1 1 1 0
217 1101 0 0 1 1 0 0 0 1
218 1101 1 1 0 1 1 0 0 1
219 1101 1 1 1 0 0 1 0 1
220 1101 0 0 0 0 1 1 0 1
221 1101 1 0 0 0 0 0 1 1
222 1101 0 1 1 0 1 0 1 1
223 1101 0 1 0 1 0 1 1 1
224 1101 1 0 1 1 1 1 1 1
225 1110 1 0 0 0 0 0 0 0
226 1110 0 1 1 0 1 0 0 0
227 1110 0 1 0 1 0 1 0 0
228 1110 1 0 1 1 1 1 0 0
229 1110 0 0 1 1 0 0 1 0
230 1110 1 1 0 1 1 0 1 0
231 1110 1 1 1 0 0 1 1 0
232 1110 0 0 0 0 1 1 1 0
233 1110 1 1 1 1 0 0 0 1
234 1110 0 0 0 1 1 0 0 1
235 1110 0 0 1 0 0 1 0 1
236 1110 1 1 0 0 1 1 0 1
237 1110 0 1 0 0 0 0 1 1
238 1110 1 0 1 0 1 0 1 1
239 1110 1 0 0 1 0 1 1 1
240 1110 0 1 1 1 1 1 1 1
241 1111 1 1 1 1 0 0 0 0
242 1111 0 0 0 1 1 0 0 0
243 1111 0 0 1 0 0 1 0 0
244 1111 1 1 0 0 1 1 0 0
245 1111 0 1 0 0 0 0 1 0
246 1111 1 0 1 0 1 0 1 0
247 1111 1 0 0 1 0 1 1 0
248 1111 0 1 1 1 1 1 1 0
249 1111 1 0 0 0 0 0 0 1
250 1111 0 1 1 0 1 0 0 1
251 1111 0 1 0 1 0 1 0 1
252 1111 1 0 1 1 1 1 0 1
253 1111 0 0 1 1 0 0 1 1
254 1111 1 1 0 1 1 0 1 1
255 1111 1 1 1 0 0 1 1 1
256 1111 0 0 0 0 1 1 1 1
43.2 Analysis of confounded designs
43.2.1 Speed bump data
These data are adapted from Khademi, et al (2013). They’re trying to figure out where to put speed bumps. The factors are number of passengers (1 or 5), car speed (10 or 30 kph), distance from bump to stop point (10 or 30 m), and surface inclincation (0 or 7 degrees). The response is time to reach the stop point. The experiment was run over two days (pun very much intended), confounding the four factor interaction.
> seconds <- c(3.45,3.36,1.41,1.44,6.21,6.69,3.14,3.22,3.68,3.89,1.76,1.81,8.39,8.31,3.23,3.30)
> pass <- factor(rep(c(1,5),length=16))
> speed <- factor(rep(c(10,30),each=2,length=16))
> dist <- factor(rep(c(10,20),each=4,length=16))
> incl <- factor(rep(c(0,7),each=8,length=16))
Looking at the TwoSeriesPlots()
, Basso-Salmaso and PSE agree
that everything that does not involve pass
is significant at .01.
In particular, the four-factor interaction (confounding
with blocks) is not a “big” term that we need to ignore.
> bump.lm <- lm(seconds~pass*speed*dist*incl)
> set.seed(12345)
> TwoSeriesPlots(bump.lm,pse=TRUE,alpha=.01)
B set to 13934

Figure 43.1: Half normal plot of speed bump data
Looking at the residual plot from the model including
speed:dist
shows nonconstant variance.
> plot(lm(seconds~speed*dist*incl),which=1)

Figure 43.2: Residual plot for two-way model of speed bump data
Box-Cox says to use a power between -1 (reciprocal, or average speed until stop) and 0 (log).
> boxCox(lm(seconds~speed*dist*incl))

Figure 43.3: Box-Cox for two-way model of speed bump data
Looking again at the log scale brings a surprise. The
speed:dist:incl
three-way is significant; hierarchy would
say we need to include everything in our model except terms
involving pass
. (In some simulation runs, only the
three main effects and the three-way show up as significant;
in other runs, everthing not involving pass
is significant.)
> logbump.lm <- lm(log(seconds)~pass*speed*dist*incl)
> TwoSeriesPlots(logbump.lm)
B set to 3000

Here’s what that interaction looks like. Inclination has very little effect when both speed and distance are high, or both speed and distance are low. But it does have an effect when one is high and the other is low.
> interactplot(speed:dist,incl,log(seconds))

Figure 43.4: Interaction plot for bump data
43.3 DNPK data
These data are from a \(2^4\) design replicated twice, blocked into four blocks of size 8 with dnpk confounded with blocks in both replicates. dnpk is thus completely confounded. Data are from Cochran and Cox.
Notice that in the first block of each replication, there are always an odd number of factors at the high level (either 1 or 3), whereas in block 2 of each replication there is always an even number of factors at the high level (0, 2, or 4).
> dnpk <- read.table("dnpk.dat.txt",header=TRUE)
> names(dnpk)
[1] "d" "n" "p" "k" "block" "rpl" "yield"
> dnpk <- within(dnpk,{d <- factor(d);n <- factor(n);
+ p <- factor(p);k <- factor(k)})
> dnpk <- within(dnpk,{block <- factor(block);rpl<-factor(rpl)})
> dnpk
d n p k block rpl yield
1 1 1 2 1 1 1 45
2 1 1 1 2 1 1 55
3 2 1 1 1 1 1 53
4 1 2 2 2 1 1 36
5 2 2 1 2 1 1 41
6 2 2 2 1 1 1 48
7 2 1 2 2 1 1 55
8 1 2 1 1 1 1 42
9 2 1 2 1 2 1 50
10 1 2 1 2 2 1 44
11 2 1 1 2 2 1 43
12 1 1 2 2 2 1 51
13 2 2 2 2 2 1 44
14 1 1 1 1 2 1 58
15 2 2 1 1 2 1 41
16 1 2 2 1 2 1 50
17 1 1 2 1 1 2 39
18 1 1 1 2 1 2 50
19 2 1 1 1 1 2 42
20 1 2 2 2 1 2 43
21 2 2 1 2 1 2 34
22 2 2 2 1 1 2 52
23 2 1 2 2 1 2 44
24 1 2 1 1 1 2 47
25 2 1 2 1 2 2 52
26 1 2 1 2 2 2 43
27 2 1 1 2 2 2 52
28 1 1 2 2 2 2 56
29 2 2 2 2 2 2 54
30 1 1 1 1 2 2 57
31 2 2 1 1 2 2 42
32 1 2 2 1 2 2 39
Here
are the basic model and ANOVA.
These data were set up with blocks numbered 1 and 2 in each
replication, so the replication by block “interaction” actually
enumerates all four blocks, with 3 degrees of freedom between
the four blocks. We want treatments adjusted for blocks, and we
did not quite get it here, because R wants to put two factor terms (in
this case, all blocks is a two factor term) after main effects. In this case
is does not matter, but in other cases it might. In those
cases, we need to use the terms()
function to get the terms in
the order we want, or we need to make a single factor to enumerate
all of the blocks (perhaps using join()
).
Note that the four factor interaction dnpk does not even show up in this table. That is because it is confounded with blocks within each replication and has 0 degrees of freedom. It cannot be estimated because it is completely confounded with blocks.
> dnpk.lm <- lm(yield~rpl:block+d*n*p*k,data=dnpk)
> anova(dnpk.lm)
Analysis of Variance Table
Response: yield
Df Sum Sq Mean Sq F value Pr(>F)
d 1 2.00 2.00 0.0824 0.778258
n 1 325.12 325.12 13.3974 0.002572 **
p 1 6.12 6.12 0.2524 0.623205
k 1 4.50 4.50 0.1854 0.673303
rpl:block 3 126.37 42.12 1.7358 0.205538
d:n 1 32.00 32.00 1.3186 0.270083
d:p 1 242.00 242.00 9.9720 0.006982 **
n:p 1 78.13 78.13 3.2193 0.094393 .
d:k 1 6.13 6.13 0.2524 0.623205
n:k 1 32.00 32.00 1.3186 0.270083
p:k 1 24.50 24.50 1.0096 0.332058
d:n:p 1 2.00 2.00 0.0824 0.778258
d:n:k 1 10.13 10.13 0.4172 0.528774
d:p:k 1 15.13 15.13 0.6233 0.443007
n:p:k 1 32.00 32.00 1.3186 0.270083
Residuals 14 339.75 24.27
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If you look at the coefficients you will see that we have missing (NA) for the four factor interaction.
> dnpk.lm$coefficients
(Intercept) d1 n1 p1 k1 rpl1:block1 rpl2:block1 rpl1:block2
49.3750 0.2500 3.1875 -0.4375 0.3750 -2.5000 -5.5000 -1.7500
rpl2:block2 d1:n1 d1:p1 n1:p1 d1:k1 n1:k1 p1:k1 d1:n1:p1
NA 1.0000 2.7500 1.5625 -0.4375 -1.0000 0.8750 -0.2500
d1:n1:k1 d1:p1:k1 n1:p1:k1 d1:n1:p1:k1
-0.5625 0.6875 1.0000 NA
Residuals don’t look too bad … maybe decreasing variance?
> plot(dnpk.lm,which=1)

Figure 43.5: Residual plot of dnpk data
> plot(dnpk.lm,which=2)

Figure 43.6: NPP of residuals for dnpk data
The ANOVA tells us that n (nitrogren) is significant along with the d:p (dung by phosphorus) interaction (but not the main effects). Of course, hierarchy says keep the main effects. Here is what the interaction looks like. It really is purely interactive.
> interactplot(d,p,yield,data=dnpk,conf=.95,sigma2=24.27,df=14)

Figure 43.7: d:p interaction plot for dnpk data
43.4 Potatoes data
Data from John (1971). A \(2^3\) replicated four times and run in 8 blocks of 4. ABC, AB, AC, and BC are each confounded in one replication. Thus this design is a partial confounding design.
Factors are sulfate of ammonia, sulfate of potash, and nitrogen; response is yield of potatoes in pounds per plot.
> potatoes <- read.table("john.dat.txt",header=TRUE)
> potatoes <- within(potatoes,{a <- factor(a);b <- factor(b);c <- factor(c);block <- factor(block)})
> potatoes
a b c block yield
1 1 1 1 1 101
2 2 1 2 1 373
3 1 2 2 1 398
4 2 2 1 1 291
5 1 1 2 2 312
6 2 1 1 2 106
7 1 2 1 2 265
8 2 2 2 2 450
9 1 1 1 3 106
10 2 2 1 3 306
11 1 1 2 3 324
12 2 2 2 3 449
13 1 2 1 4 272
14 2 1 1 4 89
15 1 2 2 4 407
16 2 1 2 4 338
17 1 1 1 5 87
18 2 1 2 5 324
19 1 2 1 5 279
20 2 2 2 5 471
21 1 1 2 6 323
22 2 1 1 6 128
23 1 2 2 6 423
24 2 2 1 6 334
25 1 1 1 7 131
26 2 1 1 7 103
27 1 2 2 7 445
28 2 2 2 7 437
29 1 1 2 8 324
30 2 1 2 8 361
31 1 2 1 8 302
32 2 2 1 8 272
The basic model is treatments after blocks. You can estimate and test all terms, because none is completely confounded.
Potash, nitrogen, and their interaction are highly significant
(b
, c
, b:c
), ammonia is fairly significant, and
its interaction with nitrogen is marginally significant.
> potatoes.lm <- lm(yield~block+a*b*c,data=potatoes)
> anova(potatoes.lm)
Analysis of Variance Table
Response: yield
Df Sum Sq Mean Sq F value Pr(>F)
block 7 4499 643 2.0147 0.112834
a 1 3465 3465 10.8624 0.004268 **
b 1 161170 161170 505.2090 4.404e-14 ***
c 1 278818 278818 873.9916 4.666e-16 ***
a:b 1 28 28 0.0883 0.769960
a:c 1 1803 1803 5.6507 0.029457 *
b:c 1 11528 11528 36.1366 1.402e-05 ***
a:b:c 1 45 45 0.1422 0.710737
Residuals 17 5423 319
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Variance appears to be increasing, but 1 is well within the Box-Cox interval.
> plot(potatoes.lm,which=1)

Figure 43.8: Residual plot for potatoes data
We can see the effect of partial confounding in the standard errors of the effects. The main effects are never confounded, and they have smaller SEs. The ratio of the variances is .75, which means that each interaction is confounded in one of the four replications.
> summary(potatoes.lm)
Call:
lm(formula = yield ~ block + a * b * c, data = potatoes)
Residuals:
Min 1Q Median 3Q Max
-25.0938 -9.5469 0.5729 6.4531 28.2396
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 291.594 3.157 92.352 < 2e-16 ***
block1 -2.219 9.115 -0.243 0.81059
block2 -6.969 9.115 -0.765 0.45501
block3 3.573 9.115 0.392 0.69993
block4 -14.010 9.115 -1.537 0.14267
block5 -10.010 9.115 -1.098 0.28740
block6 19.073 9.115 2.093 0.05170 .
block7 9.323 9.115 1.023 0.32072
a1 -10.406 3.157 -3.296 0.00427 **
b1 -70.969 3.157 -22.477 4.40e-14 ***
c1 -93.344 3.157 -29.563 4.67e-16 ***
a1:b1 1.083 3.646 0.297 0.76996
a1:c1 8.667 3.646 2.377 0.02946 *
b1:c1 -21.917 3.646 -6.011 1.40e-05 ***
a1:b1:c1 1.375 3.646 0.377 0.71074
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 17.86 on 17 degrees of freedom
Multiple R-squared: 0.9884, Adjusted R-squared: 0.9788
F-statistic: 103.3 on 14 and 17 DF, p-value: 1.311e-13
> 3.157^2/3.646^2
[1] 0.7497489
By fooling around and only using subsets of the data, we can isolate which terms are confounded in which blocks (or we could just look at the data and think a little bit).
We can’t fit a:b:c
in the first replication (blocks 1 and 2),
so it is confounded there; we can fit everything else.
Similarly, we can’t fit
b:c
in the last replication(blocks 7 and 8), so it is
confounded there. And so on.
> lm(yield~block+a*b*c,data=potatoes,subset=as.numeric(block) < 3)
Call:
lm(formula = yield ~ block + a * b * c, data = potatoes, subset = as.numeric(block) <
3)
Coefficients:
(Intercept) block1 a1 b1 c1 a1:b1 a1:c1
287.00 3.75 -18.00 -64.00 -96.25 1.50 10.25
b1:c1 a1:b1:c1
-23.25 NA
> lm(yield~block+a*b*c,data=potatoes,subset=as.numeric(block) > 6)
Call:
lm(formula = yield ~ block + a * b * c, data = potatoes, subset = as.numeric(block) >
6)
Coefficients:
(Intercept) block1 a1 b1 c1 a1:b1 a1:c1
296.875 -17.875 3.625 -67.125 -94.875 -5.875 10.875
b1:c1 a1:b1:c1
NA 5.375
43.5 Three series
conf.design()
can handle three series designs as well.
Here we confound \(A^1B^1\).
> conf.design(c(1,1),3)
Blocks T1 T2
1 0 0 0
2 0 2 1
3 0 1 2
4 1 1 0
5 1 0 1
6 1 2 2
7 2 2 0
8 2 1 1
9 2 0 2
Something a little bigger? Here we confound a \(3^3\) (27 treatments) into nine blocks of three with \(A^1C^2\) and \(A^1B^1\) as generators.
> big3 <- rbind(c(1,0,2),c(1,1,0))
> big3
[,1] [,2] [,3]
[1,] 1 0 2
[2,] 1 1 0
The full set of confounded effects (3 df).
> conf.set(big3,3)
[,1] [,2] [,3]
[1,] 1 0 2
[2,] 1 1 0
[3,] 0 1 1
[4,] 1 2 1
And the design itself.
> conf.design(big3,3)
Blocks T1 T2 T3
1 00 0 0 0
2 00 1 2 1
3 00 2 1 2
4 01 0 1 0
5 01 1 0 1
6 01 2 2 2
7 02 0 2 0
8 02 1 1 1
9 02 2 0 2
10 10 1 2 0
11 10 2 1 1
12 10 0 0 2
13 11 1 0 0
14 11 2 2 1
15 11 0 1 2
16 12 1 1 0
17 12 2 0 1
18 12 0 2 2
19 20 2 1 0
20 20 0 0 1
21 20 1 2 2
22 21 2 2 0
23 21 0 1 1
24 21 1 0 2
25 22 2 0 0
26 22 0 2 1
27 22 1 1 2