\chapter{Comparison of Fitness Among Groups} \label{ch:newnew}

<>=
options(keep.source = TRUE, width = 65)
ps.options(pointsize = 15)
@

\section{Introduction}

Data were collected on \emph{Echinacea angustifolia}. These data are in
the \texttt{echin2} dataset in the \texttt{aster} contributed package to
the R statistical computing environment. The data set is based on 557
\emph{Echinacea angustifolia} plants that were planted as sprouts in a
growth chamber. Seedling survivors were then transplanted to an
experimental garden. The components of fitness are the variables in the
graphical model shown in Figure~\ref{fig:graph:echinacea}.
\begin{figure}
\begin{center}
\setlength{\unitlength}{0.566 in}
\begin{picture}(8.2,1.50)(-0.05,-1.25)
\put(0,0){\makebox(0,0){$1$}}
\put(1,0){\makebox(0,0){\ttfamily lds1}}
\put(2,0){\makebox(0,0){\ttfamily lds2}}
\put(3,0){\makebox(0,0){\ttfamily lds3}}
\put(4,0){\makebox(0,0){\ttfamily ld01}}
\put(5,0){\makebox(0,0){\ttfamily ld02}}
\put(6,0){\makebox(0,0){\ttfamily ld03}}
\put(6,-1){\makebox(0,0){\ttfamily r03}}
\put(7,0){\makebox(0,0){\ttfamily ld04}}
\put(7,-1){\makebox(0,0){\ttfamily r04}}
\put(8,0){\makebox(0,0){\ttfamily ld05}}
\put(8,-1){\makebox(0,0){\ttfamily r05}}
\multiput(0.35,0)(1,0){8}{\vector(1,0){0.3}}
\multiput(6,-0.25)(1,0){3}{\vector(0,-1){0.5}}
\end{picture}
\end{center}
\caption{Graph for \emph{Echinacea angustifolia} data.
Arrows go from one life history component to another indicating
conditional dependence in the aster model.
Nodes are labeled by their associated variables.
Root nodes are associated with the constant variable $1$, indicating
presence of individuals at the outset.
If any parent variable is zero, then the child variable is also zero.
Child variables are conditionally independent given the parent variable.
If a parent variable is nonzero, then the conditional distribution of
the child variable is as follows.
\texttt{lds}$i$ and \texttt{ld0}$i$ are (conditionally) Bernoulli
(zero indicates mortality, one indicates survival)
and \texttt{r0}$i$ is (conditionally) zero-truncated Poisson.}
\label{fig:graph:echinacea}
\end{figure}

All response variables are collected in the response vector
\texttt{resp} in the data frame \texttt{echin2}.
Components of the response vector corresponding to the same individual
have the same value of the \texttt{id} variable in \texttt{echin2}.
Components of the response vector corresponding to the same node of the
graphical model (to the same ``original variable'') have the same value
of the \texttt{varb} variable in \texttt{echin2}.
The levels of \texttt{varb}, the ``original variables,'' are as follows.
Variables \texttt{lds}$i$ measure survival of individuals in a growth
chamber (periods are months).
Variables \texttt{ld0}$i$ measure survival of individuals in an
experimental field plot after transplanting (periods are years).
Variables \texttt{r0}$i$ count the number of rosettes (basal leaf
clusters), which are a surrogate of fitness.
The names used here are shortened from those in the data set, where
they are \texttt{roct200}$i$.

Individuals resulted from crosses in which (a) mates were from
different remnant populations, (b) mates were chosen at random from the
same remnant, or (c) mates shared their maternal parent (variable
\texttt{crosstype}).
Other variables in the data set measure location in the growth chamber
(\texttt{flat}) or in the field plot (\texttt{posi} and \texttt{row})
and year of crossing (1999 or 2000, variable \texttt{yearcross}).
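For concreteness, here is one way to peek at the long format just
described, looking at all the records for a single individual.
This is a sketch; it assumes the data have been loaded as in the
Data section below.
\begin{verbatim}
## sketch: all rows for one individual, one row per node
## of the graph (assumes library(aster); data(echin2))
one <- subset(echin2, id == id[1])
one[ , c("id", "varb", "resp")]
\end{verbatim}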
This data set was challenging because the covariate \texttt{flat} only
makes sense in relation to response variables in the growth chamber
(\texttt{lds}$i$), and the covariates \texttt{posi} and \texttt{row}
only make sense in relation to response variables in the field plot
(\texttt{ld0}$i$ and \texttt{r0}$i$).
The R formula mini-language is not designed to handle this sort of
situation.
Thus model matrices must be constructed ``by hand.''
(The growth chamber is, incorrectly, referred to as the ``greenhouse''
in the rest of this chapter.)

\section{Data}

Load the data, and look at the number of variables, their names, and
their types.
<>=
library(aster)
data(echin2)
names(echin2)
levels(echin2$varb)
@

\section{Set Up Aster Model}

<>=
vars <- c("lds1", "lds2", "lds3", "ld01", "ld02", "ld03",
    "roct2003", "ld04", "roct2004", "ld05", "roct2005")
pred <- c(0, 1, 2, 3, 4, 5, 6, 6, 8, 8, 10)
fam <- c(1, 1, 1, 1, 1, 1, 3, 1, 3, 1, 3)
fam.default()[fam]
@

Variable \verb@vars@ gives the names of the variables in the data that
are components of the aster response vector.
Variable \verb@pred@ gives the graphical model: \verb@pred[i]@ gives
the index of the parent variable of variable \verb@i@, or zero if the
parent is a root node.
Variable \verb@fam@ specifies families.
The original numeric code, now superseded, is 1 = Bernoulli,
2 = Poisson, 3 = zero-truncated Poisson.
For backwards compatibility, these three families are still the ones
specified by the new function \verb@fam.default@, as the last line of
the chunk above shows.

Count individuals and nodes.
<>=
nind <- length(unique(echin2$id))
nnode <- length(levels(echin2$varb))
@
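Before moving on, the graph encoding can be checked by reading the
parent of each node directly off \verb@pred@ (a sketch using the
objects just defined):
\begin{verbatim}
## sketch: the parent of node i is vars[pred[i]];
## pred[i] == 0 means node i has a root node as parent
data.frame(node = vars, parent = c("root", vars)[pred + 1])
\end{verbatim}
This should reproduce the graph in Figure~\ref{fig:graph:echinacea},
with \verb@roct200@$i$ in place of \verb@r0@$i$.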
\section{Hand Crafted Model Matrices}

\subsection{Aster Models}

We construct model matrices (which are really three-way arrays in
aster) by hand.
The sad fact is that the R formula mini-language hardly qualifies as a
language.
It has minimal syntax and very little generality.
It is just not up to specifying the models we want to fit.
However, every aster model --- a canonical affine model, as the
accepted version of the \emph{Biometrika} paper calls it --- is
specified by an affine predictor of the form
\begin{equation} \label{eq:model} \tag{${*}$}
\eta = a + M \beta
\end{equation}
which is equation (8) in the paper with the left-hand side, which is
$\varphi$ in the paper, replaced by $\eta$, our notation for a general
canonical parameter (either the unconditional $\varphi$ or the
conditional $\theta$, as the case may be).
In this document we are using unconditional models, so we could have
left (8) as it is in the paper with the canonical parameter denoted
$\varphi$, but we strive for generality.

Equation \eqref{eq:model} is a vector equation.
Variable $\eta$ is a vector whose length is the total number of nodes
in the whole graph.
The notion of the graph changed from the first draft of the paper to
the third in response to the referee's comments.
In the first draft (and this is what the \verb@aster@ package still
implements), the graph specified by the vector \verb@pred@ has, say,
\verb@nnode@ nodes, and this graph is repeated for, say, \verb@nind@
individuals.
In the second and third drafts of the paper, individuals are invisible:
the new-sense graph is \verb@nind@ identical copies of the old-sense
graph, hence has \verb@nind * nnode@ nodes.

The reasons for this change are two.
First, there is no reason for the restriction to identical copies: the
theory works perfectly well when individuals have different graphs and
the combined graph is whatever it is.
Second, the notation is simpler: model matrices no longer need to be
three-way arrays (they are really matrices in the version to appear in
\emph{Biometrika}, but the \verb@aster@ package is still stuck in the
old notation).

Now \eqref{eq:model}, which is in the new sense, is a vector equation,
so each term must have dimension \verb@nind * nnode@.
That means $\eta$ has this dimension, so does the known ``origin''
vector $a$, and so does the term $M \beta$, which is a matrix
multiplication, $M$ being a matrix, the \emph{model matrix}, and
$\beta$ being a vector of regression coefficients.
Say the length of $\beta$ is \verb@ncoef@.
Then the row dimension of $M$ must be \verb@nind * nnode@ and the
column dimension of $M$ must be \verb@ncoef@.

We do not have to worry about specifying the ``origin'' vector $a$.
Usually $M$ contains $a$ in its range space, which means the fitted
mean value parameters do not depend on $a$, although the regression
coefficients themselves do (yet another reason regression coefficients
are meaningless).

Our job is to construct the $M$ that specifies the model we want.
The R formula mini-language does this automagically in simple cases.
In cases too complicated for the stupid computer to understand (and the
R formula mini-language has only rudimentary knowledge of statistical
modeling), we have to just do it ourselves.

After we have constructed the new-sense model matrix $M$, we must
reshape it for use with the \verb@aster@ function.
With \verb@nind@, \verb@nnode@, and \verb@ncoef@ defined as above, the
following code
\begin{verbatim}
mold <- array(as.numeric(mnew), c(nind, nnode, ncoef))
\end{verbatim}
does this reshaping, turning the new-sense model matrix \verb@mnew@,
which is really a matrix, into the old-sense model matrix \verb@mold@,
which is really a three-way array.
For each \verb@k@, the column \verb@mnew[ , k]@ of the new-sense model
matrix corresponding to one regression coefficient (one ``predictor
vector'' in the regression jargon) corresponds to a matrix
\verb@mold[ , , k]@, which has dimension \verb@nind@ by \verb@nnode@.

Our strategy will be to work with new-sense model matrices, since they
are simpler, being really honest-to-God matrices, and since they
correspond to the paper to appear anyway.
Only at the end, just before they are fed into the \verb@aster@
function, will we reshape them to three-way arrays.

We also need to reshape the data (response and root) in the same
fashion.
<>=
x <- echin2$resp
dim(x) <- c(nind, nnode)
r <- 0 * x + 1
@

\subsection{Constructing One Model Matrix}

Aster model matrices are just like any other model matrix used in
regression.
Theoretically they can be any matrix of the required dimension.
In practice, they often have many columns that are zero-or-one valued;
such columns are often called ``dummy'' predictor variables.
Their effect is to add a constant, the corresponding regression
coefficient $\beta_k$, to the affine predictor $\eta_i$ for the
individuals having a one in $m_{i k}$.
So this just puts in an additive term for individuals in a certain
class (the class indicated by the dummy variable thought of as an
indicator variable).
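To make the effect of a dummy variable concrete, here is a toy sketch
(made-up numbers, not the \texttt{echin2} data): the dummy column adds
its regression coefficient to the affine predictor for exactly the
flagged items.
\begin{verbatim}
## toy sketch, not the echin2 data: the dummy column adds
## beta[2] = -1 to eta for exactly the flagged items
m <- cbind(const = 1, groupB = c(0, 1, 1, 0))
beta <- c(2.5, -1)
as.numeric(m %*% beta)  # 2.5, 1.5, 1.5, 2.5
\end{verbatim}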
We have just one quantitative variable, \verb@posi@.
Everything else is qualitative and corresponds to one or more dummy
variables.

\subsubsection{One Categorical Variable}

We start building the model matrix for the largest model we will
consider (call it the ``supermodel'') as follows.
<>=
modmat.super <- NULL
names.super <- NULL
for (i in levels(echin2$varb)) {
    modmat.super <- cbind(modmat.super, as.numeric(echin2$varb == i))
    names.super <- c(names.super, i)
}
@
It is a standard S trick to start such an accumulation with the empty
object \verb@NULL@, which acts as a vector with no elements or a
matrix with no columns.
Each trip through the loop adds one column to the model matrix and a
corresponding label to what will eventually be the column names of the
model matrix.
Note that we don't bother to construct regression coefficient labels
that are exactly the same as those constructed by the R formula
mini-language.
Those are ridiculously verbose, done by a computer that is really very
stupid.
If anyone doesn't like these labels, they can just change the way
\verb@names.super@ is defined.

Here we add one dummy variable for each element of
<>=
levels(echin2$varb)
@
what we usually call the response variables, although that terminology
does not fit well with aster terminology, which says there are really
\verb@nrow(echin2)@ or \verb@nind * nnode@ response variables.
Whatever we call them, we have now added one dummy variable for each
of them.
Thus each will have an independently fitted level (since each
corresponds to its own dummy variable).

\subsubsection{A Definition}

Next we define an indicator variable for being in the greenhouse.
<>=
in.greenhouse <- is.element(echin2$varb,
    grep("lds", levels(echin2$varb), value = TRUE))
print(unique(echin2$varb[in.greenhouse]), max.levels = 0)
print(unique(echin2$varb[! in.greenhouse]), max.levels = 0)
@

\subsubsection{Another Categorical Variable}

Now we have another loop that adds one column per trip through the
loop.
<>=
for (i in levels(echin2$flat)) if (i > "1") {
    modmat.super <- cbind(modmat.super,
        as.numeric(in.greenhouse & echin2$flat == i))
    names.super <- c(names.super, paste("flat", i, sep = ""))
}
@
Here we add one column for each of the levels
<>=
levels(echin2$flat)
@
that are greater than \verb@"1"@ in the sort order.

The reason for dropping one of the dummy variables is well known and
taught in every regression class.
The vector sum of all the dummy variables corresponding to one
categorical variable is a vector of all ones, since each item falls in
exactly one category (the dummy variables for \verb@echin2$varb@
constructed above are an example).
Since the same holds for every categorical variable, we must do one of
the following in order to construct a model matrix that is not rank
deficient.
\begin{itemize}
\item Drop one dummy variable from each group of dummy variables
corresponding to one categorical variable.
Then add a dummy variable corresponding to no categorical variable
that is a column of all ones, a so-called ``intercept'' dummy variable.
\item Drop one dummy variable from each group of dummy variables
corresponding to one categorical variable, except for one group for
which we keep them all.
\end{itemize}
Here we follow the latter strategy.
We have no ``intercept'' and do not drop any of the dummy variables
made in the first loop, but drop one from each loop thereafter.
Note that we do deal with the in-and-out-of-greenhouse issue explicitly
and hence do what is obviously the Right Thing.
The dummy variables we construct have a one if and only if the item in
question has \verb@in.greenhouse@ equal to \verb@TRUE@ and
\verb@echin2$flat@ equal to the value we are constructing a dummy
variable for.
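Here is a toy illustration (again made-up numbers, not the
\texttt{echin2} data) of the rank deficiency this strategy avoids: a
full set of dummies for one categorical variable sums to a column of
all ones, so keeping them all \emph{and} an intercept makes the model
matrix rank deficient.
\begin{verbatim}
## toy sketch: a full set of dummies sums to the
## intercept column
f <- c("a", "b", "b", "c")
d <- outer(f, c("a", "b", "c"), "==") + 0
rowSums(d)                     # all ones
m.bad <- cbind(1, d)
qr(m.bad)$rank == ncol(m.bad)  # FALSE: rank deficient
\end{verbatim}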
\subsubsection{Yet Another Categorical Variable}

Now we have another loop that adds one column per trip through the
loop.
<>=
for (i in levels(echin2$row)) if (i > "10") {
    modmat.super <- cbind(modmat.super,
        as.numeric((! in.greenhouse) & echin2$row == i))
    names.super <- c(names.super, paste("row", i, sep = ""))
}
@
Here we add one column for each of the levels
<>=
levels(echin2$row)
@
that are greater than \verb@"10"@ in the sort order.
This is very similar to the preceding loop, except that the level
\verb@"0"@ is bogus, a forlorn attempt to deal with the
in-and-out-of-greenhouse issue, so we must drop one dummy variable
besides the bogus one.
Again we deal with the in-and-out-of-greenhouse issue explicitly and
obviously do the Right Thing.
The dummy variables we construct have a one if and only if the item in
question has \verb@in.greenhouse@ equal to \verb@FALSE@ and
\verb@echin2$row@ equal to the value we are constructing a dummy
variable for.

\subsubsection{And Another Categorical Variable}

Now we have another loop that adds one column per trip through the
loop.
<>=
for (i in levels(echin2$yearcross)) if (i >= "2000") {
    modmat.super <- cbind(modmat.super,
        as.numeric(echin2$yearcross == i))
    names.super <- c(names.super, paste("yc", i, sep = ""))
}
@
Here we add one column for each of the levels
<>=
levels(echin2$yearcross)
@
that are greater than or equal to \verb@"2000"@ in the sort order.
This is very similar to the preceding loops, but simpler.
There is no bogus level to deal with, and there is no
in-and-out-of-greenhouse issue.
The fact that the body of the if statement executes only once, so we
add only one dummy variable (one column of the model matrix), is not a
problem.
It just works.

\subsubsection{Finally, a Quantitative Variable}

No loop is needed to construct a predictor variable that is
quantitative.
We just make it a column of the model matrix.
<>=
modmat.super <- cbind(modmat.super,
    as.numeric(! in.greenhouse) * echin2$posi)
names.super <- c(names.super, "posi")
@
We do have to deal with the in-and-out-of-greenhouse issue.
The Right Thing is to make the quantitative variable zero when it has
no effect, because then it makes zero contribution to the canonical
parameter, which is obviously what is wanted.
One might think that this ``locates'' all the in-greenhouse variables
at \verb@posi@ equal to zero, but this is an illusion.
The other dummy variables that the in-greenhouse variables have give
them independently fitted levels (corresponding to their regression
coefficients), so there is no problem.

\subsubsection{Interaction of Crosstype and Final Response in the Field}

We have another loop that adds, again, one column per trip through the
loop.
<>=
for (i in levels(echin2$crosstype)) if (i > "W") {
    modmat.super <- cbind(modmat.super,
        as.numeric(echin2$crosstype == i & echin2$varb == "roct2005"))
    names.super <- c(names.super, paste("cross", i, sep = ""))
}
@
Here we add one column for each of the levels
<>=
levels(echin2$crosstype)
@
that are greater than \verb@"W"@ in the sort order.
Here we do something very tricky, the same sort of trickiness involved
in the \verb@- pop@ in the model formulae in the \emph{Biometrika}
paper, but even that was clear as mud, and when combined with the
in-and-out-of-greenhouse issue, the R formula mini-language is just not
up to the task.
The dummy variable(s) we add,
\Sexpr{sum(levels(echin2$crosstype) > "W")} of them, have a one if and
only if the item in question has \verb@echin2$varb@ equal to
\verb@"roct2005"@ and \verb@echin2$crosstype@ equal to the value we are
constructing a dummy variable for.
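As a sanity check on this trickiness (a sketch, usable while
\verb@modmat.super@ is still an ordinary matrix, before the reshaping
below), the cross-type dummies just added should be nonzero only at
the \verb@"roct2005"@ node:
\begin{verbatim}
## sketch: the crosstype dummies are zero except at
## the "roct2005" node
j <- grep("^crossW", names.super)
all(modmat.super[echin2$varb != "roct2005", j] == 0)  # TRUE
\end{verbatim}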
\subsubsection{Interaction of Crosstype and Final Response in the
Greenhouse}

The columns we add here are just like those added in the preceding
section except that they involve the last greenhouse variable,
\verb@lds3@.
<>=
for (i in levels(echin2$crosstype)) if (i > "W") {
    modmat.super <- cbind(modmat.super,
        as.numeric(echin2$crosstype == i & echin2$varb == "lds3"))
    names.super <- c(names.super, paste("crossgreen", i, sep = ""))
}
@

\subsection{Reshape This Matrix}

<>=
nodename <- unique(as.character(echin2$varb))
modmat.super <- array(as.vector(modmat.super),
    c(dim(x), length(names.super)))
dimnames(modmat.super) <- list(NULL, nodename, names.super)
@
The \verb@array@ function constructs arrays.
We then assign appropriate dimnames using the \verb@names.super@ that
we accumulated as we built the matrix.

\subsection{Extract Submatrices}

Now we extract three submatrices of this supermodel matrix: one having
only the field-crosstype effect, one having only the
greenhouse-crosstype effect, and one having neither.
<>=
ifield <- grep("crossW", names.super)
names.super[ifield]
igreen <- grep("crossgreenW", names.super)
names.super[igreen]
modmat.field <- modmat.super[ , , -igreen]
modmat.green <- modmat.super[ , , -ifield]
modmat.sub <- modmat.super[ , , -c(ifield, igreen)]
@

\section{Model Fits and Hypothesis Tests}

\subsection{Fit Models}

Fit aster models.
Although it is not obvious from the syntax, the function \verb@aster@
is generic, with two methods \verb@aster.formula@ and
\verb@aster.default@.
If the first argument is a formula, the former is used.
If not, as here, the latter is used.
<>=
out.sub <- aster(x, r, pred, fam, modmat.sub)
summary(out.sub)
@
\pagebreak[3]
<>=
out.field <- aster(x, r, pred, fam, modmat.field)
summary(out.field)
@
\pagebreak[3]
<>=
out.green <- aster(x, r, pred, fam, modmat.green)
summary(out.green)
@
\pagebreak[3]
<>=
out.super <- aster(x, r, pred, fam, modmat.super)
summary(out.super)
@

\subsection{Tests of Model Comparison}

\begin{figure}
\begin{center}
\setlength{\unitlength}{0.5 in}
\begin{picture}(3,3)(1,1)
\put(2.5,3.5){\makebox(0,0){\tt out.super}}
\put(1.5,2.5){\makebox(0,0){\tt out.field}}
\put(3.5,2.5){\makebox(0,0){\tt out.green}}
\put(2.5,1.5){\makebox(0,0){\tt out.sub}}
\put(2.25,3.25){\vector(-1,-1){0.5}}
\put(2.75,3.25){\vector(1,-1){0.5}}
\put(1.75,2.25){\vector(1,-1){0.5}}
\put(3.25,2.25){\vector(-1,-1){0.5}}
\end{picture}
\end{center}
\caption{Relationship of Aster Models.
Arrows go from one model to another that is a nested submodel of it.}
\label{fig:graph}
\end{figure}

Figure~\ref{fig:graph} shows the relationship between these models.
The only models that are not nested and so cannot be directly compared
are \verb@out.field@ and \verb@out.green@ (although something can be
said from the comparison of each to the models above and below them).
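Although no likelihood ratio test compares \verb@out.field@ and
\verb@out.green@ directly, a rough comparison via an information
criterion is possible.
The following is a sketch; it assumes, as is conventional, that the
\verb@deviance@ component of an \verb@aster@ fit is minus twice the
maximized log likelihood up to a constant that is the same for all
models fit to the same data.
\begin{verbatim}
## sketch: AIC-style comparison of the two non-nested
## models (only differences are meaningful)
aic <- function(out)
    out$deviance + 2 * length(out$coefficients)
aic(out.field)
aic(out.green)
\end{verbatim}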
The following \verb@anova@ commands do the four tests associated with
the four arrows in Figure~\ref{fig:graph}.
<>=
anova(out.sub, out.field, out.super)
anova(out.sub, out.green, out.super)
@
<>=
tout.sub.field <- anova(out.sub, out.field)
tout.sub.field.pval <- tout.sub.field[["P(>|Chi|)"]][2]
tout.sub.field.pval.expo <- floor(log10(tout.sub.field.pval))
tout.sub.field.pval.frac <- tout.sub.field.pval /
    10^tout.sub.field.pval.expo
tout.sub.green <- anova(out.sub, out.green)
tout.sub.green.pval <- tout.sub.green[["P(>|Chi|)"]][2]
tout.green.super <- anova(out.green, out.super)
tout.green.super.pval <- tout.green.super[["P(>|Chi|)"]][2]
tout.green.super.pval.expo <- floor(log10(tout.green.super.pval))
tout.green.super.pval.frac <- tout.green.super.pval /
    10^tout.green.super.pval.expo
tout.field.super <- anova(out.field, out.super)
tout.field.super.pval <- tout.field.super[["P(>|Chi|)"]][2]
@

We reach the following conclusions of a purely statistical nature
(there is no point in scientific interpretations until the statistics
is done).
First look at the right-hand side of Figure~\ref{fig:graph}.
We see that, although the model with crosstype effect in the greenhouse
only (\verb@out.green@) does fit significantly better
($P = \Sexpr{round(tout.sub.green.pval, 3)}$) than the baseline model
with no crosstype effect, it does not fit as well
($P = \Sexpr{round(tout.green.super.pval.frac, 1)} \times
10^{\Sexpr{tout.green.super.pval.expo}}$) as the supermodel with both
crosstype effects.
Thus, looking at the right-hand side of Figure~\ref{fig:graph} only,
the hypothesis tests of model comparison indicate that the supermodel
(\verb@out.super@) is the only model that fits the data.

Now we look at the left-hand side of Figure~\ref{fig:graph}.
We see that the model with crosstype effect in the field only
(\verb@out.field@) not only fits significantly better
($P = \Sexpr{round(tout.sub.field.pval.frac, 1)} \times
10^{\Sexpr{tout.sub.field.pval.expo}}$) than the baseline model with no
crosstype effect but also does not fit significantly worse
($P = \Sexpr{round(tout.field.super.pval, 2)}$) than the supermodel
with both crosstype effects.
Thus, looking at the left-hand side of Figure~\ref{fig:graph} only, the
hypothesis tests of model comparison indicate that the middle model
(\verb@out.field@) is the most parsimonious model that fits the data
(the supermodel also fits the data, but no better than
\verb@out.field@).

Looking at all the tests together, the overall conclusion is that
\verb@out.field@ is the most parsimonious model that fits the data.
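For readers who want to see what \verb@anova@ is doing under the hood,
here is one of the tests above done by hand (a sketch, using the
\verb@deviance@ and \verb@coefficients@ components of the fits):
\begin{verbatim}
## sketch: likelihood ratio test of out.field against
## out.super, the same test anova reports above
dev.diff <- out.field$deviance - out.super$deviance
df.diff <- length(out.super$coefficients) -
    length(out.field$coefficients)
pchisq(dev.diff, df.diff, lower.tail = FALSE)
\end{verbatim}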
\section{Mean Value Parameters}

\subsection{In the Field}

We need to make a new model matrix for hypothetical individuals having
the same data structure as \verb@modmat.super@.
We want three individuals, one for each value of \verb@crosstype@.
The easiest way to do this is to just use the first three ``rows''
(really layers of a three-way array) of \verb@modmat.super@ and adjust
the predictors to be what we want.
<>=
newmodmat.super <- modmat.super[1:3, , ]
i <- grep("ld0[1-5]|lds[1-3]|roct200[3-5]",
    dimnames(newmodmat.super)[[3]])
newmodmat.super[ , , -i] <- 0
newmodmat.super[2, "roct2005", "crossWi"] <- 1
newmodmat.super[3, "roct2005", "crossWr"] <- 1
newmodmat.super[2, "lds3", "crossgreenWi"] <- 1
newmodmat.super[3, "lds3", "crossgreenWr"] <- 1
@
We set all ``predictor'' values except those corresponding to response
variables to zero (which we take to be a ``neutral'' or ``typical''
value) and then set the ones for cross type back to what we want them
to be: individual 2 is cross type \verb@"Wi"@, individual 3 is cross
type \verb@"Wr"@, and individual 1 is the remaining cross type,
\verb@"Br"@.

Now we extract the other new model matrices just as we did the old.
<>=
newmodmat.field <- newmodmat.super[ , , -igreen]
newmodmat.green <- newmodmat.super[ , , -ifield]
newmodmat.sub <- newmodmat.super[ , , -c(ifield, igreen)]
@

Now we need root data and observed data of the appropriate shape to go
with this model matrix.
Since the observed ``x'' data are ignored when we predict unconditional
mean value parameters, we can use the same structure for both.
<>=
newroot <- array(1, dim = dim(newmodmat.field)[1:2])
@

We also need an \verb@amat@ argument that picks off the
\verb@"roct2005"@ elements of the mean value parameter vector.
<>=
amat <- array(0, dim = c(dim(newmodmat.field)[1:2], 3))
for (i in 1:3)
    amat[i, dimnames(newmodmat.field)[[2]] == "roct2005", i] <- 1
@
This says we want the \verb@i@-th component of the prediction to be the
unconditional mean value parameter for \verb@"roct2005"@ for the
\verb@i@-th individual.

Now we can predict.
<>=
pout.super <- predict(out.super, newroot, newroot, newmodmat.super,
    amat, se.fit = TRUE)
pout.super$fit
pout.field <- predict(out.field, newroot, newroot, newmodmat.field,
    amat, se.fit = TRUE)
pout.field$fit
pout.green <- predict(out.green, newroot, newroot, newmodmat.green,
    amat, se.fit = TRUE)
pout.green$fit
pout.sub <- predict(out.sub, newroot, newroot, newmodmat.sub,
    amat, se.fit = TRUE)
pout.sub$fit
@

Figure~\ref{fig:one} is produced by the following code
<>=
modelfits <- list(sub = pout.sub, green = pout.green,
    field = pout.field, super = pout.super)
@
<>=
conf.level <- 0.95
crit <- qnorm((1 + conf.level) / 2)
crossnames <- levels(echin2$crosstype)
icross <- c(1, 3, 2)
foo <- 0.05
modelcolors <- c("blue", "green", "red", "black")
modellty <- c(3, 2, 1, 4)
ylim <- NULL
for (i in seq(along = modelfits)) {
    thefit <- modelfits[[i]]
    ylim <- range(ylim, thefit$fit + crit * thefit$se.fit)
    ylim <- range(ylim, thefit$fit - crit * thefit$se.fit)
}
xlim <- range(icross + 10 * foo, icross - 10 * foo)
plot(NA, NA, xlim = xlim, ylim = ylim, xlab = "", ylab = "",
    axes = FALSE)
for (i in seq(along = modelfits)) {
    thefit <- modelfits[[i]]
    ytop <- thefit$fit + crit * thefit$se.fit
    ybot <- thefit$fit - crit * thefit$se.fit
    ymid <- thefit$fit
    col <- modelcolors[i]
    lty <- modellty[i]
    jcross <- icross + (i - mean(seq(along = modelfits))) * 3 * foo
    segments(jcross, ybot, jcross, ytop, lty = lty, col = col)
    segments(jcross - foo, ybot, jcross + foo, ybot, lty = lty,
        col = col)
    segments(jcross - foo, ymid, jcross + foo, ymid, lty = lty,
        col = col)
    segments(jcross - foo, ytop, jcross + foo, ytop, lty = lty,
        col = col)
}
axis(side = 2)
axis(side = 1, at = icross, labels = crossnames)
title(xlab = "cross type")
@
and appears on p.~\pageref{fig:one}.
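The intervals plotted are ordinary Wald intervals, fit plus or minus a
normal critical value times the standard error (that is what
\verb@crit@ is for).
As a sketch of the arithmetic for the supermodel fits:
\begin{verbatim}
## sketch: the plotted 95% intervals are just
## fit +/- qnorm(0.975) * se.fit
cbind(lower = pout.super$fit - qnorm(0.975) * pout.super$se.fit,
      upper = pout.super$fit + qnorm(0.975) * pout.super$se.fit)
\end{verbatim}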
\begin{figure}
\begin{center}
<>=
<>
title(ylab = "unconditional expected rosette count")
@
\end{center}
\caption{\Sexpr{100 * conf.level}\% confidence intervals for
unconditional mean value parameter for fitness (rosette count in the
last year recorded) for one ``typical'' individual for each cross type.
Colors indicate model: blue, \texttt{out.sub}; green,
\texttt{out.green}; red, \texttt{out.field}; black, \texttt{out.super}.}
\label{fig:one}
\end{figure}

\subsection{In the Greenhouse} \label{sec:green-mean}

This section is very similar to the preceding one.
The only difference is that we want to predict \verb@lds3@ values
rather than \verb@roct2005@ values.
We also need an \verb@amat@ argument that picks off the \verb@"lds3"@
elements of the mean value parameter vector.
<>=
amat <- array(0, dim = c(dim(newmodmat.field)[1:2], 3))
for (i in 1:3)
    amat[i, dimnames(newmodmat.field)[[2]] == "lds3", i] <- 1
@
Now we can predict as before.
<>=
<>
modelfits.field <- modelfits
modelfits <- list(sub = pout.sub, green = pout.green,
    field = pout.field, super = pout.super)
modelfits.green <- modelfits
@
Figure~\ref{fig:two} is now produced by the same code as
Figure~\ref{fig:one} and appears on p.~\pageref{fig:two}.
\begin{figure}
\begin{center}
<>=
<>
title(ylab = "unconditional expected survival")
@
\end{center}
\caption{\Sexpr{100 * conf.level}\% confidence intervals for
unconditional mean value parameter for the best surrogate of fitness
measured in the greenhouse (survival in the last year recorded) for one
``typical'' individual for each cross type.
Just as in Figure~\ref{fig:one}, colors indicate model: blue,
\texttt{out.sub}; green, \texttt{out.green}; red, \texttt{out.field};
black, \texttt{out.super}.}
\label{fig:two}
\end{figure}

Note that Figure~\ref{fig:two} shows a pattern similar to
Figure~\ref{fig:one}, even though the mean value parameters predicted
in Figure~\ref{fig:one} correspond to canonical parameter values that
contain the regression coefficients that differ between the two models
and the mean value parameters predicted in Figure~\ref{fig:two} do not.
So the failure of the greenhouse crosstype effect (the difference
between \verb@out.field@ and \verb@out.super@) to be statistically
significant does not mean that there are no fitness effects in the
greenhouse; it merely means that what effects there are are adequately
explained by an unconditional aster model with no additional
parameters.

\subsection{Plot for Paper}

Here we assemble the two preceding plots into one postscript file
(Figure~\ref{fig:pap}).
\begin{figure}
\begin{center}
<>=
par(mfrow = c(1, 2), mar = c(5,4,4,0.25) + 0.1)
modelfits <- modelfits.green
<>
title(ylab = "unconditional expected survival")
box()
u <- par("usr")
text(0.8 * u[1] + 0.2 * u[2], 0.8 * u[3] + 0.2 * u[4], "A",
    adj = 0.5, cex = 2)
modelfits <- modelfits.field
<>
title(ylab = "unconditional expected rosette count")
box()
u <- par("usr")
text(0.8 * u[1] + 0.2 * u[2], 0.8 * u[3] + 0.2 * u[4], "B",
    adj = 0.5, cex = 2)
@
\end{center}
\caption{Plots for paper.
Line types: dotted, ``sub'' model; dashed, ``chamber'' model; solid,
``field'' model; dot-dash, ``super'' model.}
\label{fig:pap}
\end{figure}