Fall Seminar Series - November 13, 2003
University of Minnesota
School of Statistics
College of Liberal Arts
The benefits of assuming independence in classification when there are many more variables than observations
Liza Levina
Department of Statistics
University of Michigan
Thursday, November 13, 2003
4:00 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:30 PM, 300 Ford Hall
Abstract
While general statistical intuition tells us that using all the available
dependence information is better than not using it, we show that just the
opposite holds when there are many more variables than observations (the "large
p, small n" scenario). This phenomenon is well known in machine learning practice,
and will be demonstrated on examples from texture classification and gene
expression data. Analytically, we consider the issue in the classical context
of discriminating between two normal populations, and prove that the "naive
Bayes" classifier based on the independence assumption greatly outperforms
the discriminant rule that attempts to estimate the full covariance structure.
We also show how, in practice, shrinkage can further improve on naive Bayes.
For the special case of stationary covariance structure, we introduce a
class of rules spanning the range between independence and arbitrary dependence
and prove they achieve Bayes optimality for the Gaussian colored noise model.
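
The comparison described in the abstract can be tried on simulated data. The following is a minimal sketch, not the speaker's code: it draws two Gaussian populations with a shared AR(1) covariance, fits an independence rule (diagonal of the pooled covariance only) and a full-covariance Fisher rule (Moore-Penrose pseudo-inverse of the pooled covariance, which is singular when there are more variables than observations), and reports the test error of each. The function name compare_rules, the AR(1) covariance, and all default settings are assumptions made for illustration; which rule does better, and by how much, depends on them.

```python
import numpy as np


def compare_rules(p=200, n_per_class=20, n_test=500, rho=0.5, shift=0.3, seed=0):
    """Return (independence-rule error, full-covariance-rule error).

    All defaults are illustrative assumptions, not values from the talk.
    """
    rng = np.random.default_rng(seed)

    # Shared AR(1) covariance: Sigma[i, j] = rho ** |i - j| (assumed model).
    idx = np.arange(p)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(sigma)

    mu0 = np.zeros(p)
    mu1 = np.full(p, shift)

    def draw(mu, m):
        # m samples from N(mu, sigma)
        return mu + rng.standard_normal((m, p)) @ chol.T

    x0, x1 = draw(mu0, n_per_class), draw(mu1, n_per_class)
    t0, t1 = draw(mu0, n_test), draw(mu1, n_test)

    # Sample means and pooled within-class covariance.
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    centered = np.vstack([x0 - m0, x1 - m1])
    pooled = centered.T @ centered / (2 * n_per_class - 2)

    # Independence ("naive Bayes") rule: keep only the diagonal.
    w_ind = (m1 - m0) / np.diag(pooled)

    # Full-covariance rule: pseudo-inverse, since pooled has rank < p here.
    w_full = np.linalg.pinv(pooled) @ (m1 - m0)

    midpoint = (m0 + m1) / 2

    def error(w):
        # Predict class 1 when the linear score is positive.
        e0 = np.mean((t0 - midpoint) @ w > 0)
        e1 = np.mean((t1 - midpoint) @ w <= 0)
        return float((e0 + e1) / 2)

    return error(w_ind), error(w_full)


if __name__ == "__main__":
    err_ind, err_full = compare_rules()
    print(f"independence rule test error:    {err_ind:.3f}")
    print(f"full-covariance rule test error: {err_full:.3f}")
```

Varying p relative to n_per_class, or averaging over several seeds, gives a rough picture of how each rule behaves as the number of variables grows relative to the sample size.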