probit([Model], n:Denom [, print:F or silent:T, incr:T, offsets:vec,\ pvals:T, maxiter:m, epsilon:eps, coefs:F]), Denom REAL scalar or vector > 0, vec a REAL vector, m an integer > 0, eps REAL > 0 |

probit(Model,n:Denom) computes a probit regression fit of the model specified in the CHARACTER variable Model. If y is the response variable in the model it must be a REAL vector with y[i] >= 0. Denom must either be an REAL scalar >= max(y) or a REAL vector of the same length as y with Denom[i] >= y[i]. Estimation is by maximum likelihood on the assumption that y[i] is binomial with Denom[i] trials (Denom trials for scalar DENOM). If any y[i] or n[i] is not an integer a warning message is printed. To get the coefficients for a classic probit analysis, you should increase the estimated constant by 5. probit(Model,n:Denom,...) is equivalent to glmfit(Model,n:Denom, dist:"binomial", link:"probit",...). See topic 'models' for information on specifying Model. probit() sets the side effect variables RESIDUALS, WTDRESIDUALS, SS, DF, HII, DEPVNAME, TERMNAMES, and STRMODEL. See topic 'glm'. Without keyword phrase 'inc:T' (see below), TERMNAMES has value vector("","", ...,"Overall model","ERROR1"), DF has value vector(0,0,...,ModelDF, ErrorDF) and SS has value vector(0,0,...,ModelDeviance,ErrorDeviance). If, say, Model is "y=x1+x2", an iterative algorithm is used to predict invnor(E[y/Denom]) = probit(E[y/Denom]) - 5 as a linear function of x1 and x2. A two line Analysis of Deviance table is printed. Line 1 is the difference 2*L(1) - 2*L(0), where L(0) is the log likelihood for a model with all coefficients 0 (all probabilities = 0.5) and L(1) is the maximized log likelihood for the model fit. Line 2 is 2*L(2) - 2*L(1) where L(2) is the maximized log likelihood under a model fitting one parameter for every y[i]. Under appropriate assumptions, the latter can be used to test the goodness of fit of the model using a chi-squared test. probit(Model,n:Denom,inc:T) computes the full probit model and all partial models -- only a constant term, the constant and the first term, and so on. It prints an Analysis of Deviance table, with one line for each term, representing a difference 2*L(i) - 2*L(i-1) where L(i) is the maximumized log likely for a model including terms 1 through i, plus the deviance of the complete model labeled as "ERROR1". Each line except the last can be used in a chi-squared test to test the significance of the term on the assumption that the true model includes no later terms. If you omit Model (probit(,n:Denom ...)), the model from the most recent GLM command such as poisson() or anova(), or the model in CHARACTER variable STRMODEL is assumed. Computations are carried out using iteratively reweighted least squares. If you get a warning message similar to the following WARNING: problimit = 1e-08 was hit by probit() at least once it usually indicates either the presence of an extreme outlier or a best fitting model in which many of the probabilities are almost exactly 0 or 1. The latter case may not represent any problem, since the fitted probabilities at these points will be 1e-8 or 1 - e-8. You can try reducing the threshold using keyword 'problimit' (see below), but you will probably just get the message again. Other keyword phrases Keyword phrase Default Meaning maxiter:m 50 Positive integer m is the maximum number of iterations that will be allowed in fitting epsilon:eps 1e-6 Small positive REAL specifying relative error in objective function (2*log likelihood) required to end iteration problimit:small 1e-8 Iteration is restricted so that no fitted probabilities are < small or > 1 - small. Value of small must be between 1e-15 and 0.0001. offsets:OffVec none Causes model to be fit to probit(p) to be 1*Offvec + Model, where OffVec is a REAL vector the same length as response y. Note OffVec is in probit units. See topic 'glm_keys' for information on keyword phrases print:F, silent:T, coefs:F The default value for pvals can be changed by setoptions(pvals:T). See topics setoptions(), 'options'. Examples of the use of 'offsets'. Cmd> probit("y=x", n:15, offsets:3*x, inc:T, pvals:T) The P value associated with x can be used to test the hypothesis H0: beta1 = 3 in the model invnor(p) = beta0 + beta1*x. Cmd> probit("y=1", n:20, offsets:rep(invnor(.25),length(y)),\ inc:T, pvals:T) The P value associated with the CONSTANT term can be used to test H0: p = .25, assuming y contains a random sample from a binomial distribution with n = 20.

Gary Oehlert 2003-01-15