Spring 2001 Seminar Series - March 1, 2001
University of Minnesota
School of Statistics
College of Liberal Arts

Full Bayesian inference under Dirichlet process mixture models with applications

Athanasios Kottas
Duke University
(Statistics Search Candidate)

Thursday, March 1, 2001
4:00 PM, B10 Ford Hall
Minneapolis, East Bank Campus
Social at 3:30 PM, 300 Ford Hall

Abstract

Dirichlet process mixture models form a rich class of nonparametric mixtures. They model the unknown population distribution as a mixture of parametric distributions whose random mixing distribution is assumed to be a realization from a Dirichlet process (Ferguson, 1973). Simulation-based fitting of these models is by now well established in the literature; the Markov chain Monte Carlo methods devised share one characteristic, namely marginalization over the mixing distribution. However, this feature results in rather limited inference for functionals of the random mixture distribution. In particular, only posterior moments of linear functionals can be handled.
We provide a computational approach to obtain the entire posterior distribution for more general functionals. After the model is fitted, the approach uses the Sethuraman representation (Sethuraman, 1994) of the Dirichlet process to obtain posterior samples of the random mixing distribution. Monte Carlo integration is then used to convert each such sample to a random draw from the posterior distribution of the functional of interest. Hence, arbitrarily accurate inference is available for the functional and for comparing it across populations.
The range of inferences the approach covers is illustrated by several applications of Dirichlet process mixture models. We discuss modeling approaches for stochastically ordered distributions and for the errors in semiparametric median regression models. We also develop Dirichlet process mixture models for distributions on the positive real line, with direct applications in reliability, inference for queuing systems and survival analysis. Full inference is obtained for various functionals of interest in this setting, including the median survival time and the population density, survival, cumulative hazard and hazard functions.
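
The two-step idea in the abstract (draw the mixing distribution through the Sethuraman representation, then map each draw to the functional) can be illustrated with a minimal Python sketch. It is not the speaker's implementation. Everything concrete in it is an assumption made for illustration: the exponential kernel, the gamma base distribution, the precision value alpha = 1, the truncation at 1000 atoms, and the array theta, which is a simulated placeholder for the latent parameters that one iteration of a marginalized sampler would supply. The sketch relies on the standard fact that, given n latent parameters, the mixing distribution follows a Dirichlet process with precision alpha + n and a base distribution that mixes the original base with the empirical distribution of those parameters. Because the truncated draw is a finite sum, the survival function is evaluated directly here in place of the Monte Carlo integration mentioned in the abstract.

```python
import numpy as np

def draw_posterior_G(theta, alpha, base_sampler, rng, n_atoms=1000):
    """One truncated stick-breaking draw of G given latent parameters theta.

    Assumes G | theta ~ DP(alpha + n, (alpha*G0 + sum of point masses at theta_i) / (alpha + n)).
    base_sampler(rng, k) is a hypothetical helper returning k draws from G0.
    """
    theta = np.asarray(theta, dtype=float)
    total = alpha + theta.size
    v = rng.beta(1.0, total, size=n_atoms)   # stick-breaking fractions
    v[-1] = 1.0                              # truncation: last atom takes the leftover mass
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    # Each atom comes from G0 with probability alpha/(alpha+n), else from the theta_i.
    atoms = rng.choice(theta, size=n_atoms)
    from_base = rng.random(n_atoms) < alpha / total
    atoms[from_base] = base_sampler(rng, int(from_base.sum()))
    return w, atoms

def median_survival(w, rates, n_iter=60):
    """Median of the exponential mixture with survival S(t) = sum_j w_j * exp(-rates_j * t).

    S is decreasing, and the median lies between log(2)/max(rates) and
    log(2)/min(rates), so bisection on that bracket finds it.
    """
    lo, hi = np.log(2.0) / rates.max(), np.log(2.0) / rates.min()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(w * np.exp(-rates * mid)) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
base = lambda rng, k: rng.gamma(2.0, 0.5, size=k)   # assumed base distribution G0
theta = rng.gamma(2.0, 0.5, size=50)                # placeholder, NOT output of a fitted model

# One functional draw per draw of G; a real analysis would use a fresh theta
# (and alpha) from each retained MCMC iteration instead of reusing one.
medians = np.array([
    median_survival(*draw_posterior_G(theta, 1.0, base, rng)) for _ in range(200)
])
print(np.percentile(medians, [2.5, 50, 97.5]))
```

Replacing median_survival with any other map from (w, atoms) to a number, such as the survival or hazard function at a fixed time, gives draws of that functional in the same way, which is what makes comparisons across populations straightforward.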