BUEHLER-MARTIN DISTINGUISHED LECTURER SERIES - March 20, 22, and 23, 2006
University of Minnesota
School of Statistics
College of Liberal Arts

Partition Models and Cluster Processes

Peter McCullagh
Department of Statistics
University of Chicago

Wednesday, March 22, 2006
3:30 PM, 210 Physics
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall

Abstract

A partition B of the set [n] = {1, …, n} is a set of disjoint non-empty subsets whose union is [n], and a random partition of [n] is a probability distribution P_n on the set E_n of partitions of [n]. Three statistical applications involving partition models are described: one connected with cluster analysis, one with Bayesian multiple comparisons, and one with classification. All of these applications require the concept of a partition process, in which the relationships among an initial set of n units are unaffected by the advent of subsequent units. This non-interference condition implies that P_n is the marginal distribution of P_{n+1} under unit deletion. A cluster process is an infinite sequence of random variables Y_1, Y_2, … together with a random partition B of the index set. For a finite set {u_1, …, u_n} consisting of n individuals, the observation is a finite sequence Y_1, …, Y_n together with a random partition B of the observed units. In an exchangeable cluster process, the joint distribution P_n of (B, Y_1, …, Y_n) is invariant under permutation of units, and P_n is the marginal distribution of P_{n+1}. A simple family of processes having these properties is described, together with applications to classification problems.
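
The non-interference condition can be checked numerically for small n. The sketch below is illustrative only and is not taken from the lecture: it assumes the Ewens distribution with parameter theta as the example partition model (the abstract does not name a specific family), and the function names are hypothetical. It enumerates all partitions of [n+1], deletes unit n+1 from each, and accumulates the probabilities, so that the result can be compared with the distribution on partitions of [n].

```python
from collections import defaultdict
from fractions import Fraction
from math import factorial


def set_partitions(n):
    """Yield every partition of {1, ..., n} as a frozenset of frozensets."""
    if n == 0:
        yield frozenset()
        return
    for smaller in set_partitions(n - 1):
        # Unit n either forms a new singleton block ...
        yield smaller | {frozenset([n])}
        # ... or joins one of the existing blocks.
        for block in smaller:
            yield (smaller - {block}) | {block | {n}}


def ewens_prob(partition, theta):
    """Ewens probability of a partition (assumed example model, exact arithmetic)."""
    n = sum(len(block) for block in partition)
    numerator = Fraction(theta) ** len(partition)
    for block in partition:
        numerator *= factorial(len(block) - 1)
    denominator = Fraction(1)
    for i in range(n):
        denominator *= Fraction(theta) + i
    return numerator / denominator


def delete_unit(partition, unit):
    """Remove one unit from a partition, dropping any block left empty."""
    return frozenset(block - {unit} for block in partition if block - {unit})


def marginal_after_deletion(n, theta):
    """Distribution on partitions of [n] induced by deleting unit n+1."""
    marginal = defaultdict(Fraction)
    for partition in set_partitions(n + 1):
        marginal[delete_unit(partition, n + 1)] += ewens_prob(partition, theta)
    return dict(marginal)
```

For example, comparing `marginal_after_deletion(3, Fraction(3, 2))` with `{p: ewens_prob(p, Fraction(3, 2)) for p in set_partitions(3)}` tests the condition for n = 3. Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerance.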