glmfit([Model] [,dist:distName,link:linkName, n:denom, incr:T,\
  print:F or silent:T, maxiter:m, epsilon:eps, coefs:F, offsets:OffVec,\
  scale:sigma]), distName and linkName CHARACTER scalars, denom > 0 REAL
  scalar or vector, integer m > 0, REAL eps > 0, REAL vector OffVec

glmfit(Model,dist:DistName,link:LinkName,...) does a generalized linear model analysis with assumed response distribution DistName and link function LinkName, somewhat in the manner of program GLIM. The response variable y must be a vector (isvector(y) is True). See topic 'models' for information on and examples of quoted string or CHARACTER scalar Model.

Current legal values for DistName are "binomial", "poisson", and "normal" (or "gaussian"). If DistName is "binomial" or "poisson", you must have y[i] >= 0. Current legal values for LinkName are "logit", "probit", "log", and "identity".

If dist:DistName is omitted, the default DistName is "normal". If link:LinkName is omitted the default LinkName depends on DistName -- "logit" for "binomial", "log" for "poisson", and "identity" for "normal". Because of these defaults, glmfit(Model), with no distribution or link specified, is equivalent to anova(Model, unbalanced:T).

If DistName is "binomial" you must specify the number of trials using keyword 'n' as for logistic() or probit(). The value Denom for 'n' must either be a REAL scalar >= max(y) or a REAL vector of the same length as y with Denom[i] >= y[i].

Except when DistName is "normal" and LinkName is "identity", an iterative algorithm is used to model link(E[y]) or link(E[y/Denom]) as a linear function of X-variables associated with the right hand side of Model.

Normally a two line Analysis of Deviance table is printed. Line 1 is the difference 2*L(1) - 2*L(0), where L(0) is the log likelihood for a model with all coefficients 0 and L(1) is the maximized log likelihood for the model fit. Line 2 is 2*L(2) - 2*L(1) where L(2) is the maximized log likelihood under a model fitting one parameter for every y[i]. Under certain conditions, the latter can be used to test the goodness of fit of the model using a chi-squared test.

When DistName is "normal" and LinkName is "identity", an Analysis of Variance table is printed including all terms.

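Example. The calls described above might look like the following minimal sketch. The variable names are hypothetical: it assumes REAL vectors y, x1 and x2 of the same length already exist, and, for the binomial fits, that each y[i] is a count of successes out of 20 trials (so max(y) <= 20). No output is shown.

```
# binomial response with 20 trials per case; link defaults to "logit"
glmfit("y=x1+x2", dist:"binomial", n:20)

# same distribution with the probit link given explicitly
glmfit("y=x1+x2", dist:"binomial", link:"probit", n:20)

# no dist or link: "normal" with "identity", so an ANOVA table is printed
glmfit("y=x1+x2")
```

If the number of trials differs from case to case, a REAL vector of the same length as y with Denom[i] >= y[i] would be supplied as the value of 'n' in place of the scalar 20.
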
glmfit() sets the side effect variables RESIDUALS, WTDRESIDUALS, SS, DF, HII, DEPVNAME, TERMNAMES, and STRMODEL. See topic 'glm'. When DistName is "normal" and LinkName is "identity", SS contains the ANOVA sums of squares; otherwise SS contains deviances.

After an iterative fit without keyword phrase 'inc:T' (see below), TERMNAMES has value vector("","", ...,"Overall model","ERROR1"), DF has value vector(0,0, ...,ModelDF,ErrorDF) and SS has value vector(0,0,...,ModelDeviance, ErrorDeviance).

glmfit(Model,dist:DistName,link:LinkName,inc:T,...) computes the full fitted model and all partial models -- only a constant term, the constant and the first term, and so on. It prints an Analysis of Deviance table with one line for each term, representing a difference 2*L(i) - 2*L(i-1), where L(i) is the maximized log likelihood for a model including terms 1 through i, plus the deviance of the complete model labeled as "ERROR1". Each line except the last can be used in a chi-squared test of the significance of the term, on the assumption that the true model includes no later terms.

The value of 'inc' is ignored when DistName is "normal" and LinkName is "identity".

The use of glmfit() provides an alternative method to specify a logistic or probit analysis of binomial responses, or a log linear analysis of Poisson responses.

  Function      DistName      LinkName
  logistic()    "binomial"    "logit"
  probit()      "binomial"    "probit"
  poisson()     "poisson"     "log"
  anova()       "normal"      "identity"

In the future additional distributions such as "gamma" will be implemented, as well as additional links such as "sqrt", "recip", or "power".

If you specify an unimplemented combination of LinkName and DistName, an informative error message is printed.

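Example. The sketch below illustrates the incremental table and the side effect variables for a Poisson fit. It assumes hypothetical REAL vectors y (non-negative counts), x1 and x2. The use of cumchi() as the chi-squared cumulative distribution function is an assumption not stated in this topic; check its own help topic before relying on it. No output is shown.

```
# one Analysis of Deviance line per term, plus the ERROR1 line
glmfit("y=x1+x2", dist:"poisson", link:"log", inc:T)

# side effect variables set by the fit; SS holds deviances here
print(TERMNAMES, DF, SS)

# goodness of fit from the last (ERROR1) line, where appropriate
m <- length(SS)
1 - cumchi(SS[m], DF[m])  # assumed: cumchi(x, df) is the chi-squared CDF

# by the table above, these two calls specify the same analysis
glmfit("y=x1+x2", dist:"poisson", link:"log")
poisson("y=x1+x2")
```
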
When fitting a model with a binomial dependent variable, a warning message similar to the following

  WARNING: problimit = 1e-08 was hit by glmfit() at least once

usually indicates either the presence of an extreme outlier or a best fitting model in which many of the probabilities are almost exactly 0 or 1. The latter case may not represent any problem, since the fitted probabilities at these points will be 1e-8 or 1 - 1e-8. You can try reducing the threshold using keyword 'problimit' (see below), but you will probably just get the message again.

Other Keyword Phrases

  Keyword phrase    Default  Meaning
  maxiter:m           50     Positive integer m is the maximum number of
                             iterations that will be allowed in fitting
  epsilon:eps        1e-6    Small positive REAL specifying relative error
                             in objective function (2*log likelihood)
                             required to end iteration
  problimit:small    1e-8    With dist:"binomial", iteration is restricted
                             so that no fitted probabilities are < small
                             or > 1 - small.  Value of small must be
                             between 1e-15 and 0.0001.
  offsets:OffVec     none    Causes the model fit to the link to be
                             1*OffVec + Model, where OffVec is a REAL
                             vector the same length as response y.  OffVec
                             must be in the same units as the link
                             function, say, logits, logs, or probits.  See
                             topic 'glm_keys' for more information and
                             poisson(), logistic() and probit() for
                             examples.
  scale:sigma          1     sigma must be a positive REAL scalar or ?
                             (MISSING).  Its value will replace a default
                             multiplier used by secoefs() and contrast()
                             to compute standard errors.  If the value is
                             MISSING, sigma will be computed as
                             sqrt(SS[m]/DF[m]), where m = length(SS).  The
                             default is 1 unless dist is "normal" when it
                             is sqrt(SS[m]/DF[m]).

In secoefs(), scale multiplies the square roots of the diagonal values of the inverse of X'WX, where X is the matrix of X-variables, and W is a diagonal matrix of weights computed using the converged fit.

See topic 'glm_keys' for details on keyword phrases print:F, silent:T, coefs:F.

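Example. The keyword phrases above might be combined as in the following sketch. All variable names are hypothetical: y, x1 and x2 are REAL vectors of the same length, the binomial fits assume 20 trials per case, and expos is a vector of positive exposures of the same length as y. No output is shown.

```
# allow up to 100 iterations and require a smaller relative change
glmfit("y=x1+x2", dist:"binomial", n:20, maxiter:100, epsilon:1e-8)

# lower the limit on fitted probabilities (between 1e-15 and 0.0001)
glmfit("y=x1+x2", dist:"binomial", n:20, problimit:1e-12)

# Poisson fit with offset log(expos), which is in the units of the log link
glmfit("y=x1+x2", dist:"poisson", offsets:log(expos))

# scale:? requests sigma = sqrt(SS[m]/DF[m]) instead of the default 1
glmfit("y=x1+x2", dist:"poisson", scale:?)
secoefs()
```
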
See also topics logistic(), poisson(), probit(), 'glm'.

Gary Oehlert 2003-01-15