forstep(i,H,E,fh,fe), integer i > 0, fh > 0, fe > 0, REAL symmetric matrices H and E with no MISSING values

NOTE: This macro is OBSOLETE and is retained only for backward compatibility because it was in file MacAnova.mac in earlier versions of MacAnova. For stepwise variable selection in discriminant analysis you should use the newer macros dastepsetup(), daentervar(), daremovevar(), dastepstatus() and dasteplook().

Macro forstep() performs a variable inclusion step in forward stepwise variable selection in linear discriminant analysis.

forstep() is intended to be used after you have used manova("y = groups"), where y is a data matrix and groups is a factor, to compute hypothesis and error matrices H = matrix(SS[2,,]) and E = matrix(SS[3,,]), with fh = DF[2] and fe = DF[3] degrees of freedom respectively.

Status information about the variables currently "in" and "out" is maintained in integer vectors INS and OUTS containing the numbers of the variables currently included and currently excluded. When no variables are "in", INS = 0; when all variables are "in", OUTS = NULL. INS must be initialized, usually to 0, before forstep() can be used.

forstep(i,H,E,fh,fe), where i is the number of a variable not currently "in", adds i to INS and removes it from OUTS, and then uses macro compf() to compute F-to-enter for all variables not included in the updated INS.

The Fs-to-enter are the analysis of covariance Fs for each "out" variable, with the "in" variables being used as covariates. See topic compf(). When no variables are "in", the Fs-to-enter are the ordinary ANOVA F-statistics for each variable.

The value returned (which will normally be printed if not assigned) is structure(f:F_to_enter, df:vector(fh,fe-k), ins:INS, outs:OUTS), where F_to_enter is the vector of F-to-enter statistics, one for each variable not in INS, components ins and outs are copies of the status vectors INS and OUTS, and k is the number of variables currently "in". The F-to-enter statistics have nominal degrees of freedom fh and fe - k.
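The F-to-enter calculation described above can be illustrated outside MacAnova. The Python below is only a sketch under stated assumptions, not the code of compf() or forstep(): the names sweep_out and f_to_enter are hypothetical, an empty list stands in for INS = 0, and F-to-enter is assumed to be the usual analysis of covariance F obtained by adjusting both E and H + E for the "in" variables.

```python
def sweep_out(a, pivots):
    """Adjust a symmetric matrix for the variables indexed by `pivots`.

    Returns a copy in which the entries outside `pivots` are the residual
    sums of squares and products after regression on the pivot variables.
    Rows and columns of the pivots themselves are set to zero.
    """
    a = [row[:] for row in a]
    n = len(a)
    for k in pivots:
        d = a[k][k]
        for i in range(n):
            if i == k:
                continue
            factor = a[i][k] / d
            for j in range(n):
                a[i][j] -= factor * a[k][j]
        for j in range(n):
            a[k][j] = 0.0
            a[j][k] = 0.0
    return a


def f_to_enter(H, E, fh, fe, ins):
    """Return {variable number: F-to-enter} for every variable not in `ins`.

    `ins` holds 1-based variable numbers; an empty list means no variable
    is "in" (assumption: this plays the role of INS = 0).
    """
    p = len(H)
    pivots = [v - 1 for v in ins]      # 1-based numbers to 0-based indices
    k = len(pivots)                    # number of "in" variables (covariates)
    T = [[H[i][j] + E[i][j] for j in range(p)] for i in range(p)]
    e_adj = sweep_out(E, pivots)       # error SS adjusted for covariates
    t_adj = sweep_out(T, pivots)       # hypothesis + error SS, adjusted
    fs = {}
    for j in range(p):
        if j in pivots:
            continue
        e = e_adj[j][j]
        h = t_adj[j][j] - e            # adjusted hypothesis SS
        fs[j + 1] = (h / fh) / (e / (fe - k))
    return fs


# Made-up 2-variable example (values are illustrative only)
H = [[8.0, 4.0], [4.0, 10.0]]
E = [[20.0, 5.0], [5.0, 40.0]]
print(f_to_enter(H, E, 2, 12, []))     # ordinary ANOVA Fs, nothing "in"
print(f_to_enter(H, E, 2, 12, [1]))    # F-to-enter for variable 2 given 1
```

With no variables "in" the sketch reduces to (H[j,j]/fh)/(E[j,j]/fe) for each variable, matching the statement that the Fs-to-enter are then the ordinary ANOVA F-statistics; with k variables "in" the error degrees of freedom drop to fe - k.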
The next variable to be entered, if any, is normally the variable with the largest F-to-enter. The decision to enter it is based on the size of F-to-enter.

You can somewhat automate the start of this process as follows:
  Cmd> manova("y = groups", silent:T) # response matrix y, factor groups
  Cmd> H <- matrix(SS[2,,]); E <- matrix(SS[3,,])
  Cmd> fh <- DF[2]; fe <- DF[3]
  Cmd> INS <- 0; stuff <- compf(H,E,fh,fe)
  Cmd> stuff <- forstep(OUTS[grade(stuff$f,down:T)[1]],H,E,fh,fe);

The last step can be repeated to bring further variables "in". Of course, in practice, you want to examine the computed F-to-enter statistics to see if another variable *should* be entered.

You can do a backward step (variable deletion) using macro backstep(). One difference between backstep() and forstep() is that backstep() determines the variable to eliminate, and then updates INS and OUTS; you must tell forstep() which variable to include. See backstep() for details.

See also compf(), which computes F-to-enter for variables not in INS. Both backstep() and compf() are OBSOLETE and are retained only for backward compatibility.

See also manova(), daentervar(), daremovevar(), dastepsetup(), dastepstatus() and dasteplook().

Gary Oehlert 2003-01-15