probs <- jackknife(groups, y [,prior:P]), factor or vector of positive integers groups, REAL matrix y, and REAL vector P of positive prior probabilities with no MISSING elements

jackknife(G, y), where G is a factor or vector of positive integers of length n and y is a REAL n by p matrix with no MISSING elements, carries out a jackknife validation of linear discriminant functions designed to discriminate among the g groups defined by the levels of G.

When you estimate the error rate of a classification method by counting the errors it makes in classifying the cases in the "training sample" (the data set used to estimate the method), your estimate is biased in an optimistic direction. That is, the proportion of cases misclassified in the training sample tends to be smaller than the proportion misclassified in a new sample (a validation sample). jackknife() attempts to avoid this bias by classifying each case in the training sample with linear discriminant functions computed from all the other cases in the training sample. This is the "leave-one-out" method, sometimes called the Lachenbruch-Mickey method.

Macro jackknife() returns an n by g+1 matrix probs. probs[i,j], for j = 1,...,g, is an estimate of the posterior probability that the data in y[i,] were derived from population j. probs[i,g+1] is an integer from 1 to g indicating the population in which the case should be classified, that is, the population for which the posterior probability is largest.

For each i, 1 <= i <= n, the posterior probabilities probs[i,j], j = 1,...,g, are computed as follows. The linear discriminant functions based on y[-i,], that is, using all the data except row i, are computed, along with the discriminant function scores for the data in y[i,]. From these the posterior probabilities are computed assuming equal prior probability 1/g for each of the groups. Each group is assumed to be multivariate normal with the same variance-covariance matrix in each group. Because the discriminant functions used to classify y[i,] are computed without using y[i,], the method is close to unbiased.
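The leave-one-out computation described above can be sketched in Python. This is an illustrative re-implementation under stated assumptions, not the macro itself: the function name jackknife_lda is invented, it recomputes the discriminant functions from scratch for each left-out case (the macro uses a faster updating scheme), and posteriors come from softmaxing the linear discriminant scores under a pooled covariance matrix, as the text describes.

```python
import numpy as np

def jackknife_lda(groups, y, prior=None):
    """Illustrative leave-one-out linear discriminant classification.

    groups : (n,) integer array of group labels
    y      : (n, p) data matrix, no missing values
    prior  : (g,) prior probabilities; defaults to equal priors 1/g
    Returns an (n, g+1) array: g posterior probabilities per case, and
    the predicted group label (largest posterior) in the last column.
    """
    groups = np.asarray(groups)
    y = np.asarray(y, dtype=float)
    n, p = y.shape
    labels = np.unique(groups)            # sorted distinct group labels
    g = len(labels)
    prior = np.full(g, 1.0 / g) if prior is None else np.asarray(prior, float)

    probs = np.empty((n, g + 1))
    for i in range(n):
        keep = np.arange(n) != i          # leave case i out
        yi, gi = y[keep], groups[keep]
        means = np.vstack([yi[gi == lab].mean(axis=0) for lab in labels])
        # pooled within-group covariance from the n-1 retained cases
        resid = yi - means[np.searchsorted(labels, gi)]
        S = resid.T @ resid / (len(yi) - g)
        Sinv = np.linalg.inv(S)
        # linear discriminant scores for the held-out case y[i,]
        scores = np.array([y[i] @ Sinv @ m - 0.5 * m @ Sinv @ m
                           + np.log(prior[k]) for k, m in enumerate(means)])
        post = np.exp(scores - scores.max())   # softmax -> posteriors
        post /= post.sum()
        probs[i, :g] = post
        probs[i, g] = labels[np.argmax(post)]
    return probs
```

Because each case is classified by functions estimated without it, the resulting error counts give a nearly unbiased estimate of the misclassification rate.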
jackknife(G, y, prior:P), where P is a REAL vector of length g with no MISSING elements, does the same except the posterior probabilities are computed using prior probabilities P. Here is how you might use jackknife() to estimate the expected probability of misclassification, assuming the prior probability that a randomly selected case comes from population j is P[j].

Cmd> probs <- jackknife(G, y, prior:P)

Cmd> n <- tabs(,G,count:T) # vector of group sample sizes

Cmd> misclassprob <- tabs(,G,probs[,g+1],count:T)/n

Cmd> misclassprob[hconcat(run(g),run(g))] <- 0 # set diagonal to 0

Cmd> sum(misclassprob' %*% P)

The last line computes the estimated probability that a case randomly selected from a group with prior probabilities P will be misclassified by linear discriminant functions estimated from y. misclassprob[i,j] with i != j is an estimate of the probability that a case from population i is misclassified as population j.

This version of jackknife() is relatively fast since it computes the successive leave-one-out discriminant functions by updating the discriminant functions computed from all the data, rather than starting fresh for each case.
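The Cmd> pipeline above (tabulate a confusion matrix, row-normalize by group sizes, zero the diagonal, weight by the priors) can be sketched in Python. This is a hypothetical helper for illustration, not part of the macro; the function name expected_misclass_prob is invented.

```python
import numpy as np

def expected_misclass_prob(groups, predicted, prior):
    """Prior-weighted expected misclassification probability.

    groups    : (n,) true group labels
    predicted : (n,) predicted labels, e.g. the last column of probs
    prior     : (g,) prior probability of each group
    """
    labels = np.unique(groups)
    g = len(labels)
    # g x g confusion counts: rows = true group, columns = predicted group
    counts = np.zeros((g, g))
    for a, b in zip(groups, predicted):
        counts[np.searchsorted(labels, a), np.searchsorted(labels, b)] += 1
    # divide each row by its group's sample size -> conditional proportions
    misclassprob = counts / counts.sum(axis=1, keepdims=True)
    np.fill_diagonal(misclassprob, 0.0)   # diagonal = correct classifications
    # sum_i P[i] * P(misclassified | case from group i)
    return float(np.asarray(prior, float) @ misclassprob.sum(axis=1))
```

For example, with two groups of four cases each, one error per group, and equal priors, the expected misclassification probability is 1/4.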

Gary Oehlert 2003-01-15