The commands for analyzing linear and generalized linear models are as follows:

  anova(), fastanova()   Analysis of Variance
  glmfit()               Generalized linear model analysis
  ipf(), logistic()      Logistic Regression
  manova()               Multivariate Analysis of Variance
  poisson()              Log linear models
  probit()               Probit analysis
  regress()              Linear Regression
  robust()               Robust Regression
  screen()               Best subset linear regression

These are generally referred to as GLM commands in help topics. See their individual help entries for details. Type help(key:"glm") for a list of help entries related to analyzing linear and generalized linear models.

In addition, wtanova(), wtmanova() and wtregress() do weighted ANOVA, MANOVA and regression. Since the same computations are done when weights are specified using keyword 'weights' or 'wts' (see below), these commands are not mentioned further here.

Function glmfit() is a general function that can, with appropriate keyword arguments, be used instead of anova(), logistic(), poisson(), and probit(). In the future, additional options will allow analyses not possible at present.

All GLM commands have certain elements in common.

The first argument of a GLM command specifies a model as a quoted string or CHARACTER variable. Examples are regress("y=x1+x2+x3") and anova("x=a + a.b"). If the model is absent (for example, anova() or logistic(,n)), the most recent GLM model is assumed or the model in CHARACTER variable STRMODEL is used. Type help(models) for information on how to specify a model.

When any variable in a GLM model has MISSING values, every case with at least one MISSING value is omitted entirely. The maximum level of any factor is taken to be the maximum level found among the complete data cases.

All GLM commands except screen() create certain side-effect variables. The most important are the following (not all may be produced by every command).

  STRMODEL, a CHARACTER scalar containing the model used.
  TERMNAMES, a CHARACTER vector containing the names of the terms in the model, including the error terms. When the GLM command does an iterative fit without keyword phrase 'inc:T' (see topic 'glm_keys'), the value of TERMNAMES still has the same number of elements but has the form vector("","",...,"Overall model","ERROR1"), reflecting the fact that only model and error deviances are computed.
  DEPVNAME, a CHARACTER scalar containing the name of the response variable in the model.
  SS, a REAL vector of sums of squares or deviances, one for each term in the model. For manova() this is an array of SSCP matrices, with the first subscript indexing the term. Except when 'marginal:T' is an argument to anova(), manova() or robust(), these are computed sequentially and measure the importance of a term after fitting previous terms and ignoring later terms. The first dimension of SS has labels identical to TERMNAMES. After manova(), dimensions 2 and 3 are labeled with the column labels of the response variable if it has labels, or with vector("(1)","(2)", ...) otherwise. After a GLM command that does an iterative fit without keyword phrase 'inc:T', the value of SS is vector(0,0,...,ModelDeviance,ErrorDeviance).
  DF, a REAL vector containing the degrees of freedom associated with each term in the model. After a GLM command that does an iterative fit without keyword phrase 'inc:T', the value of DF is vector(0,0,...,ModelDF,ErrorDF).
  RESIDUALS, a REAL vector or matrix of residuals from the fitted model. For any case with MISSING values in the data, RESIDUALS is MISSING.
  WTDRESIDUALS, a REAL vector or matrix of weighted residuals from the fitted model. For analyses using iteratively re-weighted least squares, such as logistic(), probit(), or poisson(), the weights are those used on the last iteration. For any case with MISSING values in the data, WTDRESIDUALS is MISSING. WTDRESIDUALS is not created by anova(), regress() or manova() unless weights are provided.
  XTXINV (regress()), the inverse or generalized inverse of X'X or X'WX, where X is the n by k matrix of predictors, including the constant vector if it is in the model, and W is the diagonal matrix of weights, if any.
  HII, the REAL vector of leverages, the diagonal elements of X(XTXINV)X' or W X(XTXINV)X', where W is the diagonal matrix of weights, if any.
  COEF (regress() only), the model coefficients.

It is an error if any GLM command finds that any side effect variable is locked (see lockvars(), unlockvars(), 'variables:"locked_variables"').

Besides creating side effect variables, most GLM commands save "private" information about the analysis. This is used by commands such as regpred(), contrast(), coefs() and secoefs(). It can be retrieved by command modelinfo(). This information is not preserved by save() and asciisave() unless keyword phrase 'all:T' is used. It is discarded when you assign a value to STRMODEL or delete STRMODEL.

See topic 'glm_keys' for a list of keyword phrases recognized by more than one GLM command.
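
The definitions of XTXINV, HII, COEF and RESIDUALS given above can be illustrated with a small numerical sketch. It is written in plain Python, not MacAnova, and is not how any GLM command is implemented. It assumes an unweighted model with a constant and a single predictor, uses None to stand for MISSING, and every function name in it is a hypothetical helper invented for the illustration.

```python
# Illustration only: plain Python, not MacAnova. None plays the role of MISSING.
# All names below are hypothetical helpers made up for this sketch.

def complete_cases(*columns):
    """Indices of cases with no missing (None) value in any variable."""
    n = len(columns[0])
    return [i for i in range(n) if all(col[i] is not None for col in columns)]

def xtx_inverse(x):
    """Inverse of X'X, where X has a constant column and one predictor x."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx  # zero when all x are equal (not handled here)
    return [[sxx / det, -sx / det],
            [-sx / det, n / det]]

def fit(x_all, y_all):
    """Least squares fit of y = b0 + b1*x on the complete cases only."""
    keep = complete_cases(x_all, y_all)
    x = [x_all[i] for i in keep]
    y = [y_all[i] for i in keep]

    inv = xtx_inverse(x)                      # analogue of XTXINV
    sy = sum(y)                               # first element of X'y
    sxy = sum(a * b for a, b in zip(x, y))    # second element of X'y
    coef = [inv[0][0] * sy + inv[0][1] * sxy,
            inv[1][0] * sy + inv[1][1] * sxy]  # analogue of COEF

    # Leverages: diagonal of X (X'X)^-1 X', computed for retained cases only
    # (a simplification of this sketch, not a statement about HII).
    hii = [inv[0][0] + 2 * inv[0][1] * v + inv[1][1] * v * v for v in x]

    # Residuals stay missing (None) for every omitted case.
    residuals = [None] * len(x_all)
    for i, xv, yv in zip(keep, x, y):
        residuals[i] = yv - (coef[0] + coef[1] * xv)

    return {"keep": keep, "xtxinv": inv, "coef": coef,
            "hii": hii, "residuals": residuals}

result = fit([1.0, 2.0, None, 4.0, 5.0], [2.0, 4.1, 6.2, None, 9.8])
print(result["keep"])       # cases 3 and 4 (indices 2 and 3) are omitted
print(result["hii"])
print(result["residuals"])
```

In this sketch the two cases with a missing value are dropped before fitting and their residuals remain missing. With a full-rank X the leverages sum to the number of columns of X (2 here), apart from rounding.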

Gary Oehlert 2003-01-15