glm

Keywords: glm, anova, categorical data, multivariate analysis, regression
The commands for analyzing linear and generalized linear models are as
follows:
   anova(), fastanova()              Analysis of Variance
   glmfit()                          Generalized linear model analysis
   ipf(), logistic()                 Logistic Regression
   manova()                          Multivariate Analysis of Variance
   poisson()                         Log linear models
   probit()                          Probit analysis
   regress()                         Linear Regression
   robust()                          Robust Regression
   screen()                          Best subset linear regression

These are generally referred to as GLM commands in help topics.  See
their individual help entries for details.  Type help(key:"glm") for a
list of help entries related to analyzing linear and generalized linear
models.

In addition, wtanova(), wtmanova() and wtregress() do weighted ANOVA,
MANOVA and regression.  Since the same computations are done when
weights are specified using keyword 'weights' or 'wts' (see below),
these are not further mentioned here.

Function glmfit() is a general function that can, with appropriate
keyword arguments, be used instead of anova(), logistic(), poisson(),
and probit().  In the future, additional options will allow analyses not
possible at present.

All GLM commands have certain elements in common.

  The first argument of a GLM command specifies a model as a quoted
  string or CHARACTER variable.  Examples are regress("y=x1+x2+x3") and
  anova("x=a + a.b").  If the model is absent (for example, anova() or
  logistic(,n)) the most recent GLM model is assumed or the model in
  CHARACTER variable STRMODEL is used.  Type help(models) for
  information on how to specify a model.

  When there are MISSING values in any of the variables in a GLM model,
  any case with any MISSING values is omitted entirely.  The maximum
  level of any factor is taken to be the maximum level on any of the
  complete data cases.
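
  The case-deletion rule above can be sketched as follows.  This is an
  illustration in Python, not MacAnova code: None stands in for MISSING,
  and the variables y, x and a and the helper complete_cases() are
  hypothetical names chosen for the example.

```python
# Illustrative sketch only; this is Python, not MacAnova code.
# None stands in for MISSING; all names here are hypothetical.
y = [4.1, None, 5.3, 6.0, 7.2]   # response
x = [1.0, 2.0, 3.0, None, 5.0]   # predictor
a = [1, 2, 2, 3, 2]              # factor levels


def complete_cases(*variables):
    """Return the indices of cases with no missing value in any variable."""
    n = len(variables[0])
    return [i for i in range(n) if all(v[i] is not None for v in variables)]


# Any case with a missing value in any model variable is dropped entirely.
keep = complete_cases(y, x, a)

# The maximum factor level is taken over the complete cases only.
max_level = max(a[i] for i in keep)

print(keep)       # [0, 2, 4]
print(max_level)  # 2
```

  In this sketch level 3 of the factor occurs only in a case that is
  dropped, so the factor is treated as having 2 levels.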

  All GLM commands but screen() create certain side-effect variables.
  The most important are the following (not all may be produced by every
  command).

   STRMODEL, a CHARACTER scalar containing the model used.

   TERMNAMES, a CHARACTER vector containing the names of the terms in
   the model including the error terms.  When the GLM command does an
   iterative fit without keyword phrase 'inc:T' (see topic 'glm_keys'),
   the value of TERMNAMES still has the same number of elements but has
   the form vector("","",...,"Overall model","ERROR1"), reflecting the
   fact that only model and error deviances are computed.

   DEPVNAME, a CHARACTER scalar containing the name of the response
   variable in the model.

   SS, a REAL vector of sums of squares or deviances, one for each term
   in the model.  For manova() this is an array of SSCP matrices, with
   the first subscript indexing the term.  Except when 'marginal:T' is
   an argument to anova(), manova() or robust(), these are computed
   sequentially and measure the importance of a term after fitting
   previous terms, and ignoring later terms.  The first dimension of SS
   has labels identical to TERMNAMES.  After manova(), dimensions 2 and
   3 are labeled with the column labels of the response variable if it
   has labels or by vector("(1)","(2)", ...) otherwise.  After a GLM
   command that does an iterative fit without keyword phrase 'inc:T',
   the value of SS is vector(0,0,...,ModelDeviance,ErrorDeviance).

   DF, a REAL vector containing the degrees of freedom associated with
   each term in the model.  After a GLM command that does an iterative
   fit without keyword phrase 'inc:T', the value of DF is
   vector(0,0,...,ModelDF,ErrorDF).

   RESIDUALS, a REAL vector or matrix of residuals from the fitted
   model.  For any case with MISSING values in the data, RESIDUALS is
   MISSING.

   WTDRESIDUALS, a REAL vector or matrix of weighted residuals from the
   fitted model.  For analyses using iteratively re-weighted least
   squares such as logistic(), probit(), or poisson(), the weights are
   those used on the last iteration.  For any case with MISSING values
   in the data, WTDRESIDUALS is MISSING.  WTDRESIDUALS is not created by
   anova(), regress() or manova() unless weights are provided.

   XTXINV (regress()), the inverse or generalized inverse of X'X or
   X'WX, where X is the n by k matrix of predictors, including the
   constant vector if it is in the model, and W is the diagonal matrix
   of weights, if any.

   HII, the REAL vector of leverages, the diagonal elements of
   X(XTXINV)X' or W X(XTXINV)X', where W is the diagonal matrix of
   weights, if any.

   COEF (regress() only), the model coefficients.
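
The quantities described above can be illustrated for the simplest
case.  The following is a sketch in Python, not MacAnova code, and it
does not reproduce MacAnova's internal algorithm.  It assumes an
unweighted model with a constant term followed by a single predictor x,
with the constant fitted first; the data and the lowercase names
(xtxinv, coef, residuals, hii, ss, df) are made up for the example and
merely mirror the side-effect variables.

```python
# Illustrative sketch only; this is Python, not MacAnova code.
# Assumed model: y = constant + x, unweighted, constant fitted first.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]
n = len(y)

# X'X for X = [1, x], and its inverse (2 by 2 case), as in XTXINV.
sx = sum(x)
sxx = sum(v * v for v in x)
det = n * sxx - sx * sx
xtxinv = [[sxx / det, -sx / det],
          [-sx / det, n / det]]

# Coefficients b = (X'X)^(-1) X'y, as in COEF.
sy = sum(y)
sxy = sum(u * v for u, v in zip(x, y))
coef = [xtxinv[0][0] * sy + xtxinv[0][1] * sxy,
        xtxinv[1][0] * sy + xtxinv[1][1] * sxy]

# Residuals from the fitted model, as in RESIDUALS.
residuals = [yi - (coef[0] + coef[1] * xi) for xi, yi in zip(x, y)]

# Leverages: diagonal elements of X (X'X)^(-1) X', as in HII.
hii = [xtxinv[0][0] + 2 * xtxinv[0][1] * xi + xtxinv[1][1] * xi * xi
       for xi in x]

# Sequential sums of squares: the constant, then x after the constant,
# then error.  Each term is measured after fitting the previous terms.
ybar = sy / n
ss_constant = n * ybar ** 2
ss_x = (sxy - sx * sy / n) ** 2 / (sxx - sx * sx / n)
ss_error = sum(r * r for r in residuals)
ss = [ss_constant, ss_x, ss_error]
df = [1, 1, n - 2]

print(coef, hii, ss, df)
```

Two checks follow from the definitions: the sequential sums of squares
add up to the sum of squared responses, and the leverages add up to the
number of coefficients (2 here).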

It is an error if any GLM command finds that any side-effect variable is
locked (see lockvars(), unlockvars(), 'variables:"locked_variables"').

Besides creating side-effect variables, most GLM commands save "private"
information about the analysis.  This is used by commands such as
regpred(), contrast(), coefs() and secoefs().  It can be retrieved by
command modelinfo().  This information is not preserved by save() and
asciisave() unless keyword phrase 'all:T' is used.  It is discarded when
you assign a value to STRMODEL or delete STRMODEL.

See topic 'glm_keys' for a list of keyword phrases recognized by more
than one GLM command.
Gary Oehlert 2003-01-15