BUEHLER-MARTIN DISTINGUISHED LECTURER SERIES - March 20, 22, and 23, 2006
University of Minnesota
School of Statistics
College of Liberal Arts
Partition Models and Cluster Processes
Peter McCullagh
Department of Statistics
University of Chicago
Wednesday, March 22, 2006
3:30 PM, 210 Physics
Minneapolis, East Bank Campus
Social at 3:00 PM, 300 Ford Hall
Abstract
A partition B of the set [n] = {1, …, n} is a set of
disjoint non-empty subsets whose union is [n], and
a random partition of [n] is a probability distribution Pn
on the set En of partitions of [n].
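To make the definition concrete, the following minimal Python sketch enumerates the set En of partitions of [n] for small n. The function name set_partitions is illustrative and not part of the abstract; blocks are represented as lists.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]
        # ... or let it form a new singleton block.
        yield [[first]] + partition


# E3, the set of partitions of [3] = {1, 2, 3}
partitions_of_3 = list(set_partitions([1, 2, 3]))
for blocks in partitions_of_3:
    print(blocks)
```

The set E3 has 5 elements and E4 has 15 (the Bell numbers), so a random partition of [3] is a probability vector over 5 outcomes.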
Statistical applications involving partition models are described,
one connected with cluster analysis,
one connected with Bayesian multiple comparisons,
and one connected with classification.
All applications require the concept of a partition process in which
the relationships among an initial set of n units are unaffected
by the advent of subsequent units.
This non-interference condition implies that
Pn is the marginal distribution of Pn+1 under unit deletion.
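The deletion condition can be checked numerically for a specific family. The sketch below assumes the Ewens distribution, in which a partition with k blocks has probability theta^k times the product of (block size - 1)! divided by theta(theta + 1)…(theta + n - 1). This family is chosen here only as a standard illustration; the abstract does not name it. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]
        yield [[first]] + partition


def ewens_prob(partition, theta):
    """Ewens probability of a partition (assumed example family)."""
    n = sum(len(b) for b in partition)
    numerator = Fraction(theta) ** len(partition)
    for block in partition:
        numerator *= factorial(len(block) - 1)
    denominator = Fraction(1)
    for i in range(n):
        denominator *= theta + i
    return numerator / denominator


def delete_unit(partition, unit):
    """Remove `unit` from its block and drop the block if it becomes empty."""
    reduced = [frozenset(b) - {unit} for b in partition]
    return frozenset(b for b in reduced if b)


def marginal_after_deletion(n, theta):
    """Distribution on partitions of [n] induced by Pn+1 when unit n+1 is deleted."""
    marginal = {}
    for partition in set_partitions(list(range(1, n + 2))):
        key = delete_unit(partition, n + 1)
        marginal[key] = marginal.get(key, 0) + ewens_prob(partition, theta)
    return marginal


def direct(n, theta):
    """Pn computed directly on partitions of [n]."""
    return {
        frozenset(frozenset(b) for b in partition): ewens_prob(partition, theta)
        for partition in set_partitions(list(range(1, n + 1)))
    }


theta = Fraction(3, 2)
consistent = marginal_after_deletion(3, theta) == direct(3, theta)
print(consistent)
```

Exact rational arithmetic is used so that the comparison of the two distributions is not affected by floating-point rounding.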
A cluster process is an infinite sequence of random variables Y1, Y2, …
together with a random partition B of the index set.
For a finite set {u1, …, un} consisting of n individuals,
the observation is a finite sequence Y1, …, Yn together with a
random partition B of the observed units.
In an exchangeable cluster process, the joint distribution Pn of
(B, Y1, …, Yn) is invariant under permutation of units,
and Pn is the marginal distribution of Pn+1.
A simple family of processes having these properties is described
together with applications to classification problems.
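The abstract does not specify the family of processes. As a purely illustrative sketch, the Python code below simulates one exchangeable cluster process under assumptions that are not taken from the abstract: the partition is generated sequentially by the Chinese restaurant scheme with parameter theta, each block receives an independent Gaussian mean, and each Yi is Gaussian around the mean of its block. The function name and all parameters (theta, between_sd, within_sd) are hypothetical.

```python
import random


def simulate_cluster_process(n, theta=1.0, between_sd=3.0, within_sd=1.0, seed=None):
    """Return (blocks, y): a random partition of range(n) and one value per unit."""
    rng = random.Random(seed)
    blocks = []  # blocks[j] lists the units assigned to block j
    means = []   # means[j] is the latent mean of block j (assumed Gaussian)
    y = []
    for unit in range(n):
        # Join an existing block with weight equal to its size,
        # or open a new block with weight theta.
        weights = [len(b) for b in blocks] + [theta]
        choice = rng.choices(range(len(blocks) + 1), weights=weights)[0]
        if choice == len(blocks):
            blocks.append([])
            means.append(rng.gauss(0.0, between_sd))
        blocks[choice].append(unit)
        y.append(rng.gauss(means[choice], within_sd))
    return blocks, y


blocks, y = simulate_cluster_process(10, theta=1.0, seed=0)
print(blocks)
print([round(value, 2) for value in y])
```

Because units are added one at a time without altering earlier assignments, the first n units of a longer run have the same distribution as a run of length n, which is the marginal property described above.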