Student Seminar Series - May 11, 2006
University of Minnesota
School of Statistics
College of Liberal Arts
On L1-norm Multi-class Support Vector Machines: Methodology, Theory and
Applications
Lifeng Wang
Thursday, May 11, 2006
10:00 AM, 170 Ford Hall
Minneapolis, East Bank Campus
Refreshments at 9:30 AM, 300 Ford Hall
Abstract
Binary support vector machines have proven to deliver high
performance. In multi-class classification, however, issues remain
with respect to variable selection. One challenging issue is
classification and variable selection in the presence of a large
number of variables, on the order of thousands, which greatly
exceeds the size of the training sample. This often occurs in
genomics classification. To meet this challenge, this article proposes a novel
multi-class support vector machine, which performs classification
and variable selection simultaneously through an L1-norm penalized
sparse representation. The proposed methodology, together with the
developed regularization solution path, permits variable selection
even when the variables far outnumber the training observations. For
the proposed methodology, a statistical
learning theory is developed to quantify the generalization error,
where the number of variables is allowed to grow much faster than
the sample size. The operating characteristics of the methodology
are examined on both simulated and benchmark data and are compared
against competing methods in terms of prediction accuracy. The
numerical results suggest that the proposed methodology is highly
competitive.