Annealing Markov Chain Monte Carlo with Applications to Pedigree Analysis

by Charles J. Geyer and Elizabeth A. Thompson
Technical Report No. 589
School of Statistics
University of Minnesota
July 1, 1993

Research supported in part by NSF Grant DMS-9007833.
Research supported in part by NSF Grant BSR-8921839 and NIH Grant GM-46255.


Abstract

Markov chain Monte Carlo (the Metropolis-Hastings algorithm) has been used for many statistical problems including Bayesian inference, likelihood inference and tests of significance. Though the method often works well, doubts about convergence remain in all applications. Here we propose Markov chain Monte Carlo methods, distantly related to simulated annealing. These samplers mix rapidly enough to be usable even in very hard problems for which other methods do not mix even in astronomical amounts of computing time. Our samplers simulate realizations from a sequence of distributions, either simulating all simultaneously ("swapping complex") or allowing the distribution being simulated to vary randomly over time ("jumping samplers"). If the sequence of distributions is well chosen, the sampler will mix well and produce accurate answers for all the distributions. Even when there is only one distribution of interest, these annealing-like samplers may be the only known way to get a rapidly mixing sampler.

These methods are essential for attacking very hard problems, which arise in areas such as statistical genetics. We illustrate the methods with an application that is much harder than any problem previously done by Markov chain Monte Carlo. It involves ancestral inference on a very large genealogy (7 generations, 2024 individuals). The problem is to find, conditional of data on living individuals, the probabilities of each individual having been a carrier of cystic fibrosis. The unconditional probabilities are easy to calculate. Exact calculation of the conditional probabilities involves an undoable sum over all 2^2024 possible patterns of carriers. A Gibbs sampler for the problem would not mix in a reasonable time, even on the fastest imaginable computers. Our annealing-like samplers have mixing times of a few hours.

The methods may also be useful for much easier problems. It is a common concern about Markov chain Monte Carlo that one can never be sure that the answers are correct; i.e. that the chain was well mixed. Although we have no guaranteed convergence bounds for our methods, it does seem that annealing-like samplers are overkill in easy problems and should dispel doubts about convergence. As an example, we show a jumping sampler foa conditional Strauss process (a spatial point process with attraction between points), which even though much easier than the genetics example is still somewhat difficult ("sticky") when done using samplers that only move one point at a time.


Click here to download the complete PostScript technical report.