Spring Seminar Series - February 17, 2005
University of Minnesota
School of Statistics
College of Liberal Arts

Regularization and Variable Selection via the Elastic Net

Hui Zou
Department of Statistics
Stanford University

Thursday, February 17, 2005
3:30 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall

Abstract

In the practice of statistical modeling, it is often desirable to have an accurate predictive model with a sparse representation. The lasso is a promising model-building technique, performing continuous shrinkage and variable selection simultaneously. Although the lasso has shown success in many situations, it may produce unsatisfactory results in some scenarios: (1) the number of predictors (greatly) exceeds the number of observations; (2) the predictors are highly correlated and form "groups". A typical example is the gene selection problem in microarray analysis.

We propose the elastic net, a new regularization and variable selection method. Real-world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors is much larger than the number of samples. We have implemented an algorithm called LARS-EN for efficiently computing the entire elastic net regularization path, much like the LARS algorithm does for the lasso. In this talk, I will also describe some interesting applications of the elastic net in other statistical areas such as sparse principal component analysis and the margin-based kernel classifier.

This is joint work with Trevor Hastie.
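
Illustrative note: the grouping effect mentioned in the abstract can be seen in a small numerical sketch. The code below is not the LARS-EN algorithm from the talk. It is a plain coordinate-descent solver for a least-squares objective with an L1 penalty plus a squared L2 penalty, written only for illustration. The function names, the 1/(2n) scaling of the loss, the penalty values, and the synthetic data are all assumptions of this sketch, and no rescaling of the fitted coefficients is applied. Two predictors are exact copies of each other: with the L2 penalty set to zero (the lasso), one copy takes essentially all of the weight, while a positive L2 penalty splits the weight evenly between them.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def elastic_net_cd(X, y, lam1, lam2, n_sweeps=500):
    """Cyclic coordinate descent (hypothetical helper, not LARS-EN) for
        (1 / (2n)) * ||y - X b||^2 + lam1 * ||b||_1 + (lam2 / 2) * ||b||^2.
    Setting lam2 = 0 gives the lasso under the same scaling.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float)               # residual y - X @ beta, with beta = 0
    col_sq = (X ** 2).sum(axis=0) / n     # (1/n) * x_j' x_j for each column
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (column j added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam1) / (col_sq[j] + lam2)
            resid += X[:, j] * (beta[j] - new)   # keep the residual in sync
            beta[j] = new
    return beta


# Synthetic data (an assumption of this sketch): columns 0 and 1 are identical,
# column 2 is an unrelated predictor.
rng = np.random.default_rng(0)
n = 50
x1 = rng.standard_normal(n)
x3 = rng.standard_normal(n)
X = np.column_stack([x1, x1, x3])
y = 4.0 * x1 + 0.1 * rng.standard_normal(n)

beta_lasso = elastic_net_cd(X, y, lam1=0.1, lam2=0.0)
beta_enet = elastic_net_cd(X, y, lam1=0.1, lam2=1.0)

print("lasso       :", beta_lasso)
print("elastic net :", beta_enet)
```

With identical columns the lasso solution is not unique, so which copy receives the weight depends on the update order. The squared L2 term makes the objective strictly convex, which is why the two copies end up with equal coefficients.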