Student Seminar Series – May 16, 2008
University of Minnesota
School of Statistics
College of Liberal Arts

 

Penalized Regression Methods and Validation, with Particular Focus on Chemometric Data



Jessica Kraker


Friday, May 16, 2008
2:00 PM,
300 Ford Hall
Minneapolis, East Bank Campus

Refreshments at 1:30 PM
300 Ford Hall

 

Abstract


Beginning with the closed-form ridge regression model (with L2-norm loss and penalty) and advancing to more computationally intensive methods (such as the lasso and elastic net), the possibilities for penalized regression have progressed dramatically in recent years. Although fitting these models properly requires large amounts of computation time, the improvements in prediction accuracy outweigh this concern.

 

Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models are general methods used in chemometrics to predict a biological activity or property (such as toxicity) of a compound from p chemical descriptors of various types. In this context, we analyze prediction problems that call for selecting predictors concurrently with fitting the regression model. Selecting among several penalized regression models (with different loss and penalty functions) requires further assessment of model utility. Models are included with consideration for the range of predictors used by the researcher and for the type of loss function desired. Programming is implemented in the R environment to obtain and to assess the fitted models.