probs <- jackknife(groups, y [,prior:P]), factor or vector of positive integers groups, REAL matrix y, and REAL vector P of positive prior probabilities with no MISSING elements

jackknife(G, y), where G is a factor or vector of positive integers of length n and y is a REAL n by p matrix with no MISSING elements, carries out a jackknife validation of linear discriminant functions designed to discriminate among the g groups defined by the levels of G.

When you estimate the error rate of a classification method by counting the errors it makes in classifying the cases in the "training sample" (the data set used to estimate the method), your estimate is biased in an optimistic direction. That is, the proportion of cases misclassified in the training sample tends to be smaller than the proportion misclassified in a new sample (a validation sample). jackknife() attempts to avoid this bias by classifying each case in the training sample with linear discriminant functions computed from all the other cases in the training sample. This is the "leave-one-out" method, sometimes called the Lachenbruch-Mickey method.

Macro jackknife() returns an n by g+1 matrix probs. probs[i,j], for j = 1,...,g, is an estimate of the posterior probability that the data in y[i,] were derived from population j. probs[i,g+1] is an integer from 1 to g indicating the population in which the case should be classified, that is, the population for which the posterior probability is largest.

For each i, 1 <= i <= n, the posterior probabilities probs[i,j], j = 1,...,g, are computed as follows. The linear discriminant functions based on y[-i,], that is, using all the data except row i, are computed, along with the discriminant function scores for the data in y[i,]. From these the posterior probabilities are computed assuming equal prior probability 1/g for each of the groups. Each group is assumed to be multivariate normal with the same variance-covariance matrix in each group. Because the discriminant functions used to classify y[i,] are computed without using y[i,], the method is close to unbiased.
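The leave-one-out computation described above can be sketched in Python. This is an illustrative re-implementation under stated assumptions, not the macro itself: the function name jackknife_lda is invented, it recomputes the discriminant functions from scratch for each left-out case (the macro uses a faster updating scheme), and posteriors come from softmaxing the linear discriminant scores under a pooled covariance matrix, as the text describes.

```python
import numpy as np

def jackknife_lda(groups, y, prior=None):
    """Illustrative leave-one-out linear discriminant classification.

    groups : (n,) integer array of group labels
    y      : (n, p) data matrix, no missing values
    prior  : (g,) prior probabilities; defaults to equal priors 1/g
    Returns an (n, g+1) array: g posterior probabilities per case, and
    the predicted group label (largest posterior) in the last column.
    """
    groups = np.asarray(groups)
    y = np.asarray(y, dtype=float)
    n, p = y.shape
    labels = np.unique(groups)            # sorted distinct group labels
    g = len(labels)
    prior = np.full(g, 1.0 / g) if prior is None else np.asarray(prior, float)

    probs = np.empty((n, g + 1))
    for i in range(n):
        keep = np.arange(n) != i          # leave case i out
        yi, gi = y[keep], groups[keep]
        means = np.vstack([yi[gi == lab].mean(axis=0) for lab in labels])
        # pooled within-group covariance from the n-1 retained cases
        resid = yi - means[np.searchsorted(labels, gi)]
        S = resid.T @ resid / (len(yi) - g)
        Sinv = np.linalg.inv(S)
        # linear discriminant scores for the held-out case y[i,]
        scores = np.array([y[i] @ Sinv @ m - 0.5 * m @ Sinv @ m
                           + np.log(prior[k]) for k, m in enumerate(means)])
        post = np.exp(scores - scores.max())   # softmax -> posteriors
        post /= post.sum()
        probs[i, :g] = post
        probs[i, g] = labels[np.argmax(post)]
    return probs
```

Because each case is classified by functions estimated without it, the resulting error counts give a nearly unbiased estimate of the misclassification rate.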
jackknife(G, y, prior:P), where P is a REAL vector of length g with no MISSING elements, does the same except the posterior probabilities are computed using prior probabilities P. Here is how you might use jackknife() to estimate the expected probability of misclassification, assuming the prior probability that a randomly selected case comes from population j is P[j].

Cmd> probs <- jackknife(G, y, prior:P)

Cmd> n <- tabs(,G,count:T) # vector of group sample sizes

Cmd> misclassprob <- tabs(,G,probs[,g+1],count:T)/n

Cmd> misclassprob[hconcat(run(g),run(g))] <- 0 # set diagonal to 0

Cmd> sum(misclassprob' %*% P)

The last line computes the estimated probability that a case randomly selected from a group with prior probabilities P will be misclassified by linear discriminant functions estimated from y. misclassprob[i,j] with i != j is an estimate of the probability that a case from population i is misclassified as population j.

This version of jackknife() is relatively fast since it computes the successive leave-one-out discriminant functions by updating the discriminant functions computed from all the data, rather than starting fresh for each case.
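The Cmd> pipeline above (tabulate a confusion matrix, row-normalize by group sizes, zero the diagonal, weight by the priors) can be sketched in Python. This is a hypothetical helper for illustration, not part of the macro; the function name expected_misclass_prob is invented.

```python
import numpy as np

def expected_misclass_prob(groups, predicted, prior):
    """Prior-weighted expected misclassification probability.

    groups    : (n,) true group labels
    predicted : (n,) predicted labels, e.g. the last column of probs
    prior     : (g,) prior probability of each group
    """
    labels = np.unique(groups)
    g = len(labels)
    # g x g confusion counts: rows = true group, columns = predicted group
    counts = np.zeros((g, g))
    for a, b in zip(groups, predicted):
        counts[np.searchsorted(labels, a), np.searchsorted(labels, b)] += 1
    # divide each row by its group's sample size -> conditional proportions
    misclassprob = counts / counts.sum(axis=1, keepdims=True)
    np.fill_diagonal(misclassprob, 0.0)   # diagonal = correct classifications
    # sum_i P[i] * P(misclassified | case from group i)
    return float(np.asarray(prior, float) @ misclassprob.sum(axis=1))
```

For example, with two groups of four cases each, one error per group, and equal priors, the expected misclassification probability is 1/4.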

Gary Oehlert 2003-01-15