Spring Seminar Series  March 1, 2007

University of Minnesota
School of Statistics
College of Liberal Arts

Aggregation and Sparsity in High Dimensions: l1 Regularization and On-line Algorithms

Florentina Bunea
Department of Statistics
Florida State University

Thursday, March 1, 2007
3:30 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall

Abstract

Model combining and model selection are intrinsically related to any data analysis and are receiving increasing attention in high dimensions. Combining and selection strategies can be analyzed under the unifying framework of aggregation. In this talk I will focus on aggregation for conditional mean estimation of elements of large dictionaries of functions. I will discuss a number of theoretical merits of a popular and computationally efficient aggregation procedure: l1-penalized least squares.

Aggregation based on l1-penalized least squares will be shown to have qualities that are agreeable both to the proponents of model selection and to those advocating model averaging. This result relies on a novel type of oracle inequality on the risk of the aggregate.

In addition, if the target function f has a sparse but unknown approximation within the given dictionary, the l1-penalized least squares aggregate will adapt to this unknown sparsity. The adaptation properties will be given in terms of finite-sample oracle inequalities. Such results hold under general assumptions and are especially useful when the size of the dictionary is much larger than the sample size. I will introduce and discuss a class of problems in neuroscience where these findings are of importance.

I will also briefly discuss aggregation via on-line combining algorithms and show that the risk of the corresponding aggregate attains optimal aggregation bounds. As an application, I will show how the on-line algorithms can be used as an alternative to selecting the tuning parameter in l1-penalized least squares procedures.