Spring Seminar Series, January 25, 2007
University of Minnesota
School of Statistics
College of Liberal Arts

Consistent Model Selection and Data-driven Smooth Tests for Clustered Data

Lan Wang
School of Statistics
University of Minnesota

Thursday, January 25, 2007
3:30 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall


Abstract

An important problem in marginal regression analysis of clustered data, as in the method of generalized estimating equations (GEE), is how to choose a marginal regression model from a number of candidate models. Although several methods have been suggested in the literature for practical use, a theoretical investigation of their large-sample properties is still lacking. We propose a new BIC-type model selection criterion in this paper and prove that, with probability approaching one, it selects the most parsimonious correct model. The criterion uses a recently proposed quadratic inference function and does not require specifying the full likelihood or quasi-likelihood. This model selection procedure also motivates a data-driven Neyman-type smooth test for checking the goodness of fit of a conjectured model. Compared with classical tests that require the specification of an alternative, such as the GEE Z-test, the new test selects a data-driven alternative based on model selection and generally achieves higher power. Numerical simulations and data analysis will be discussed to illustrate the application. (Joint work with Annie Qu.)