Student Seminar Series - June 27, 2005
University of Minnesota
School of Statistics
College of Liberal Arts

Estimation of Generalization Error: Random and Fixed Inputs


Junhui Wang


Monday, June 27, 2005
2:00 PM, 127 Ford Hall
Minneapolis, East Bank Campus

Refreshments at 1:30 PM
300 Ford Hall


Abstract

In multicategory classification, an estimate of the generalization error is often used to quantify a classifier's generalization ability. As a result, the quality of this estimate becomes crucial in tuning and combining classifiers. This proposal presents an estimation methodology for the generalization error that permits a treatment of both fixed and random inputs, in contrast to the conditional classification error commonly used in the statistics literature. In particular, we derive a novel data perturbation technique that jointly perturbs inputs and outputs to estimate the generalization error. We show that the proposed technique yields optimal tuning and combination, as measured by generalization. We also demonstrate via simulation that it outperforms cross-validation for both fixed and random designs in the context of margin classification. The results support the utility of the proposed methodology.