Keywords:
glm, anova, categorical data, multivariate analysis, regression
The commands for analyzing linear and generalized linear models are as
follows:
  anova(), fastanova()   Analysis of Variance
  glmfit()               Generalized linear model analysis
  ipf(), logistic()      Logistic Regression
  manova()               Multivariate Analysis of Variance
  poisson()              Log linear models
  probit()               Probit analysis
  regress()              Linear Regression
  robust()               Robust Regression
  screen()               Best subset linear regression
These are generally referred to as GLM commands in help topics. See
their individual help entries for details. Type help(key:"glm") for a
list of help entries related to analyzing linear and generalized linear
models.
In addition, wtanova(), wtmanova() and wtregress() do weighted ANOVA,
MANOVA and regression. Since the same computations are done when
weights are specified using keyword 'weights' or 'wts' (see below),
these are not further mentioned here.
Function glmfit() is a general function that can, with appropriate
keyword arguments, be used instead of anova(), logistic(), poisson(),
and probit(). In the future, additional options will allow analyses not
possible at present.
All GLM commands have certain elements in common.
The first argument of a GLM command specifies a model as a quoted
string or CHARACTER variable. Examples are regress("y=x1+x2+x3") and
anova("x=a + a.b"). If the model is absent (for example, anova() or
logistic(,n)), the most recent GLM model is assumed or the model in
CHARACTER variable STRMODEL is used. Type help(models) for
information on how to specify a model.
When there are MISSING values in any of the variables in a GLM model,
any case with any MISSING values is omitted entirely. The maximum
level of any factor is taken to be the maximum level on any of the
complete data cases.
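The case-deletion rule above can be sketched in a few lines. This is plain Python, not MacAnova syntax, and the names MISSING, complete_cases, y and a are hypothetical stand-ins chosen for illustration; it only mimics the rule as stated here and is not how MacAnova itself is implemented.

```python
# Illustration only: plain Python, not MacAnova syntax.
MISSING = None  # hypothetical stand-in for MacAnova's MISSING value


def complete_cases(*columns):
    """Return the indices of cases with no MISSING value in any column."""
    n = len(columns[0])
    return [i for i in range(n)
            if all(col[i] is not MISSING for col in columns)]


y = [4.1, 5.0, MISSING, 6.2, 7.3]      # response
a = [1, 2, 3, 2, MISSING]              # factor levels

# A case is dropped if ANY variable in the model is MISSING for it.
keep = complete_cases(y, a)
y_used = [y[i] for i in keep]
a_used = [a[i] for i in keep]

# The maximum factor level is taken over the complete cases only, so
# level 3 (which occurs only on a dropped case) is not counted.
max_level = max(a_used)

print(keep)       # [0, 1, 3]
print(max_level)  # 2
```

In this sketch the third and fifth cases are dropped, so the factor is treated as having 2 levels rather than 3.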
All GLM commands but screen() create certain side-effect variables.
The most important are the following (not all may be produced by every
command).
STRMODEL, a CHARACTER scalar containing the model used.
TERMNAMES, a CHARACTER vector containing the names of the terms in
the model including the error terms. When the GLM command does an
iterative fit without keyword phrase 'inc:T' (see topic 'glm_keys'),
the value of TERMNAMES still has the same number of elements but has
the form vector("","",...,"Overall model","ERROR1"), reflecting the
fact that only model and error deviances are computed.
DEPVNAME, a CHARACTER scalar containing the name of the response
variable in the model.
SS, a REAL vector of sums of squares or deviances, one for each term
in the model. For manova() this is an array of SSCP matrices, with
the first subscript indexing the term. Except when 'marginal:T' is
an argument to anova(), manova() or robust(), these are computed
sequentially and measure the importance of a term after fitting
previous terms, and ignoring later terms. The first dimension of SS
has labels identical to TERMNAMES. After manova(), dimensions 2 and
3 are labeled with the column labels of the response variable if it
has labels or by vector("(1)","(2)", ...) otherwise. After a GLM
command that does an iterative fit without keyword phrase 'inc:T',
the value of SS is vector(0,0,...,ModelDeviance,ErrorDeviance).
DF, a REAL vector containing the degrees of freedom associated with
each term in the model. After a GLM command that does an iterative
fit without keyword phrase 'inc:T', the value of DF is
vector(0,0,...,ModelDF,ErrorDF).
RESIDUALS, a REAL vector or matrix of residuals from the fitted
model. For any case with MISSING values in the data, RESIDUALS is
MISSING.
WTDRESIDUALS, a REAL vector or matrix of weighted residuals from the
fitted model. For analyses using iteratively re-weighted least
squares such as logistic(), probit(), or poisson(), the weights are
those used on the last iteration. For any case with MISSING values
in the data, WTDRESIDUALS is MISSING. WTDRESIDUALS is not created by
anova(), regress() or manova() unless weights are provided.
XTXINV (regress() only), the inverse or generalized inverse of X'X or
X'WX, where X is the n by k matrix of predictors, including the
constant vector if it is in the model, and W is the diagonal matrix
of weights, if any.
HII, the REAL vector of leverages, the diagonal elements of
X(XTXINV)X' or W X(XTXINV)X', where W is the diagonal matrix of
weights, if any.
COEF (regress() only), the model coefficients.
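To make the definitions of several of these quantities concrete, here is a minimal sketch for an unweighted straight-line regression with a constant term. It is plain Python, not MacAnova syntax; the data and all variable names (xtxinv, coef, residuals, hii, ss, df) are hypothetical analogues of the side-effect variables, and the term order "constant, x, error" is an assumption made for illustration.

```python
# Illustration only: plain Python, not MacAnova syntax.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(y)

# Design matrix X: constant column plus x (n by k, with k = 2).
X = [[1.0, xi] for xi in x]

# X'X and its inverse by the explicit 2 by 2 formula (analogue of XTXINV).
s1, sx, sxx = float(n), sum(x), sum(xi * xi for xi in x)
det = s1 * sxx - sx * sx
xtxinv = [[sxx / det, -sx / det],
          [-sx / det, s1 / det]]

# Coefficients (X'X)^(-1) X'y (analogue of COEF).
xty = [sum(y), sum(xi * yi for xi, yi in zip(x, y))]
coef = [sum(xtxinv[r][c] * xty[c] for c in range(2)) for r in range(2)]

# Residuals y - X coef (analogue of RESIDUALS).
fitted = [coef[0] + coef[1] * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]


def quad(row):
    """Quadratic form row (X'X)^(-1) row', one diagonal element of the hat matrix."""
    return sum(row[r] * xtxinv[r][c] * row[c]
               for r in range(2) for c in range(2))


# Leverages: diagonal of X (X'X)^(-1) X' (analogue of HII).
hii = [quad(row) for row in X]

# Sequential sums of squares: constant first, then x after the constant,
# then error (analogues of SS and DF).
ybar = sum(y) / n
ss_const = n * ybar ** 2
sse_const_only = sum((yi - ybar) ** 2 for yi in y)
ss_error = sum(r * r for r in residuals)
ss_x = sse_const_only - ss_error
ss = [ss_const, ss_x, ss_error]
df = [1, 1, n - 2]

print([round(c, 4) for c in coef])      # approximately [2.2, 0.6]
print([round(h, 4) for h in hii])       # approximately [0.6, 0.3, 0.2, 0.3, 0.6]
print([round(s, 4) for s in ss], df)    # approximately [80.0, 3.6, 2.4] [1, 1, 3]
```

Two checks are visible in the sketch: the leverages sum to k = 2, the number of columns of X, and the sequential sums of squares add up to the uncorrected total sum of squares of y (86 for these data).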
It is an error if any GLM command finds that any side-effect variable is
locked (see lockvars(), unlockvars(), 'variables:"locked_variables"').
Besides creating side-effect variables, most GLM commands save "private"
information about the analysis. This is used by commands such as
regpred(), contrast(), coefs() and secoefs(). It can be retrieved by
command modelinfo(). This information is not preserved by save() and
asciisave() unless keyword phrase 'all:T' is used. It is discarded when
you assign a value to STRMODEL or delete STRMODEL.
See topic 'glm_keys' for a list of keyword phrases recognized by more
than one GLM command.
Gary Oehlert
2003-01-15