Spring Seminar Series - February 17, 2005
University of Minnesota
School of Statistics
College of Liberal Arts
Regularization and Variable Selection via the Elastic Net
Hui Zou
Department of Statistics
Stanford University
Thursday, February 17, 2005
3:30 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall
Abstract
In the practice of statistical modeling, it is often desirable to have an
accurate predictive model with a sparse representation.
The lasso is a promising model building technique,
performing continuous shrinkage and variable selection simultaneously.
Although the lasso has shown success in many situations, it may
produce unsatisfactory results in some scenarios: (1) the number of
predictors (greatly) exceeds the number of observations;
(2) the predictors are highly correlated and form "groups".
A typical example is the gene selection problem in microarray analysis.
We propose the elastic net, a new regularization and variable
selection method. Real world data and a simulation study show that
the elastic net often outperforms the lasso, while enjoying a
similar sparsity of representation. In addition, the elastic net
encourages a grouping effect, where strongly correlated predictors
tend to be in or out of the model together. The elastic net is
particularly useful when the number of predictors is much larger
than the number of samples. We have implemented an algorithm
called LARS-EN for efficiently computing the entire elastic net
regularization path, much like the LARS algorithm does for the
lasso. In this talk, I will also describe some interesting
applications of the elastic net in other statistical areas such as
sparse principal component analysis and margin-based
kernel classifiers.
This is joint work with Trevor Hastie.
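
The grouping effect described in the abstract can be illustrated with a small sketch. This is not the LARS-EN algorithm from the talk: it uses plain cyclic coordinate descent as a stand-in, and it assumes the objective (1/(2n))||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2 with no rescaling correction. The function names, the penalty values, and the toy data (two identical predictors plus one unrelated predictor) are all hypothetical and chosen only for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam1, lam2, sweeps=2000):
    """Cyclic coordinate descent for the assumed objective
    (1/(2n))||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2.
    Setting lam2 = 0 gives the lasso. X is a list of rows."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0   # x_j' (partial residual), summed over rows
            norm = 0.0  # x_j' x_j
            for i in range(n):
                fit_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam1) / (norm / n + lam2)
    return beta


# Toy data (hypothetical): predictors 0 and 1 are identical, so they form
# a perfectly correlated "group"; predictor 2 is an alternating sign pattern.
z = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
s = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
X = [[z[i], z[i], s[i]] for i in range(6)]
y = [2.0 * z[i] + 0.5 * s[i] for i in range(6)]

beta_lasso = elastic_net_cd(X, y, lam1=0.1, lam2=0.0)
beta_enet = elastic_net_cd(X, y, lam1=0.1, lam2=0.5)

print("lasso      :", [round(b, 4) for b in beta_lasso])
print("elastic net:", [round(b, 4) for b in beta_enet])
```

With identical predictors the lasso solution is not unique, and in this sketch coordinate descent puts essentially all of the group's weight on whichever predictor it visits first. Once lam2 > 0 the objective is strictly convex, so the two identical predictors receive equal coefficients and enter the model together.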