Spring Seminar Series - January 26, 2006
University of Minnesota
School of Statistics
College of Liberal Arts

Margin-based Semi-supervised Learning

Junhui Wang
School of Statistics
University of Minnesota

Thursday, January 26, 2006
3:30 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall

Abstract


In classification, semi-supervised learning arises when a large amount of unlabeled data is available but only a small amount of labeled data. In this setting, the central question is how to use the unlabeled data to improve the predictive accuracy of classification. In this talk, we introduce a novel margin-based semi-supervised learning methodology that combines grouping information from unlabeled data with the concept of margins, in the form of a regularization that controls the interplay between labeled and unlabeled data. In addition, we estimate the generalization error using both labeled and unlabeled data, and use this estimate to tune the regularization. The methodology is implemented for support vector machines (SVM) as well as $\psi$-learning through difference convex programming, which reduces to sequential quadratic programming. Finally, our theoretical and numerical analyses indicate that the proposed methodology achieves the desired objective of high generalization performance, particularly in comparison with SVM trained on labeled data alone and with transductive SVM.