forstep()

Usage:
forstep(i,H,E,fh,fe), integer i > 0, fh > 0, fe > 0, REAL symmetric
  matrices H and E with no MISSING values



Keywords: factor analysis, iteration
NOTE: This macro is OBSOLETE and is retained only for backward
compatibility because it was in file MacAnova.mac in earlier versions of
MacAnova.  For doing stepwise variable selection in discriminant
analysis you should use newer macros dastepsetup(), daentervar(),
daremovevar(), dastepstatus() and dasteplook().

Macro forstep() performs a variable inclusion step in forward stepwise
variable selection in linear discriminant analysis.

forstep() is intended to be used after you have used manova("y =
groups"), where y is a data matrix and groups is a factor, to compute
hypothesis and error matrices H = matrix(SS[2,,]) and E =
matrix(SS[3,,]), with fh = DF[2] and fe = DF[3] degrees of freedom
respectively.

Status information about the variables currently "in" and "out" is
maintained in integer vectors INS and OUTS containing numbers of
variables currently included and currently excluded.  When no variables
are "in", INS = 0; when all variables are "in", OUTS = NULL.  INS must
be initialized, usually to 0, before forstep() can be used.

forstep(j,H,E,fh,fe), where j is the number of a variable not currently
"in", adds j to INS and removes it from OUTS, and then uses macro
compf() to compute F-to-enter for all variables not included in the
updated INS.  The Fs-to-enter are the analysis of covariance Fs for each "out"
variable, with the "in" variables being used as covariates.  See topic
compf().  When no variables are "in", the Fs-to-enter are the ordinary
ANOVA F-statistics for each variable.
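
As a rough illustration of the arithmetic only, the following Python
sketch computes F-to-enter statistics from H and E.  It is not MacAnova
code and is not taken from compf(); it assumes the Fs are the usual
analysis of covariance Fs, in which the diagonal elements of E and of
H + E are adjusted for the "in" variables and the adjusted hypothesis
mean square is divided by the adjusted error mean square on fe - k
degrees of freedom.  The names solve, adjusted and f_to_enter are
hypothetical, and variables are numbered from 0 rather than from 1.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def adjusted(m, j, ins):
    """Diagonal element m[j][j] after adjusting for the variables in ins."""
    if not ins:
        return m[j][j]
    sub = [[m[r][c] for c in ins] for r in ins]
    cross = [m[r][j] for r in ins]
    coef = solve(sub, cross)
    return m[j][j] - sum(c * v for c, v in zip(coef, cross))


def f_to_enter(h, e, fh, fe, ins):
    """Return {variable index: F} for each variable not in ins (0-based).

    Assumption: F = (adjusted hypothesis SS / fh) / (adjusted error SS / (fe - k)),
    where k is the number of variables in ins.
    """
    p = len(e)
    k = len(ins)
    t = [[h[r][c] + e[r][c] for c in range(p)] for r in range(p)]
    result = {}
    for j in range(p):
        if j in ins:
            continue
        e_adj = adjusted(e, j, ins)
        h_adj = adjusted(t, j, ins) - e_adj
        result[j] = (h_adj / fh) / (e_adj / (fe - k))
    return result
```

With an empty ins the adjustment step is skipped, so each F reduces to
(H[j][j]/fh)/(E[j][j]/fe), the ordinary ANOVA F for that variable.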

The value returned (which will normally be printed if not assigned) is
structure(f:F_to_enter, df:vector(fh,fe-k), ins:INS, outs:OUTS), where
F_to_enter is the vector of F-to-enter statistics, one for each
variable not in INS, k is the number of variables currently "in", and
components ins and outs are copies of the status vectors INS and OUTS.

The F-to-enter statistics have nominal degrees of freedom fh and fe -
k.  The next variable to be entered, if any, is normally the variable
with the largest F-to-enter.  The decision to enter it is based on the
size of F-to-enter.

You can somewhat automate the start of this process as follows:

  Cmd> manova("y = groups", silent:T) # response matrix y, factor groups

  Cmd> H <- matrix(SS[2,,]); E <- matrix(SS[3,,])

  Cmd> fh <- DF[2]; fe <-  DF[3]

  Cmd> INS <- 0; stuff <- compf(H,E,fh,fe)

  Cmd> stuff <- forstep(OUTS[grade(stuff$f,down:T)[1]],H,E,fh,fe);

The last step can be repeated to bring further variables "in".  Of
course, in practice, you want to examine the computed F-to-enter
statistics before each repetition to see if another variable *should*
be entered.
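
The decision logic described above can be sketched in Python as
follows.  This is an illustration only, not MacAnova code: the names
pick_candidate and enter are hypothetical, the cutoff f_min is a value
you would have to choose yourself (this help topic does not prescribe
one), and the order in which variable numbers are kept in the status
lists is an assumption.

```python
def pick_candidate(f_values, outs, f_min):
    """Return the "out" variable with the largest F-to-enter.

    f_values[i] is the F-to-enter for variable outs[i].  Returns None
    when nothing is "out" or the largest F is below the cutoff f_min.
    """
    if not outs:
        return None
    best = max(range(len(outs)), key=lambda i: f_values[i])
    return outs[best] if f_values[best] >= f_min else None


def enter(j, ins, outs):
    """Mimic the status update: add j to ins and drop it from outs.

    An ins of [0] stands for "no variables in"; an empty outs stands
    for OUTS = NULL.
    """
    new_ins = [v for v in ins if v != 0] + [j]
    new_outs = [v for v in outs if v != j]
    return new_ins, new_outs
```

For example, with Fs of 4.8 and 1.8 for variables 1 and 2 and a cutoff
of 4.0, pick_candidate() selects variable 1; once it is entered, a
remaining F below the cutoff makes pick_candidate() return None and the
process stops.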

You can do a backward step (variable deletion) using macro backstep().
One difference between backstep() and forstep() is that backstep()
determines the variable to eliminate, and then updates INS and OUTS; you
must tell forstep() which variable to include.  See backstep() for
details.  See also compf() which computes F-to-enter for variables not
in INS.

Both backstep() and compf() are OBSOLETE and are retained only for
backward compatibility.

See also manova(), daentervar(), daremovevar(), dastepsetup(),
dastepstatus() and dasteplook().


Gary Oehlert 2003-01-15