Statistics 8931 (Geyer) Fall 2005

Course Announcement

Instructor: Charles Geyer (5-8511, charlie@stat.umn.edu)

Markov chain Monte Carlo (MCMC) is simulation of a probability model in which the simulations X1, X2, ... form a Markov chain rather than being independent and identically distributed, as is the case in old-fashioned ordinary Monte Carlo (OMC).

It is the most general form of simulation implementable on a computer. Consider a computer program with a main loop that, each time through, makes some random (or pseudorandom, if you prefer that terminology) modification of the program state; if Xi is the program state at the bottom of the i-th pass through the loop, then X1, X2, ... is a Markov chain.
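To make the "main loop" picture concrete, here is a minimal sketch (my illustration, not part of the course materials) in Python: the state after each pass through the loop depends only on the state at the top of that pass plus fresh randomness, so the recorded states form a Markov chain.

    import random

    random.seed(42)

    state = 0.0          # program state before the loop (X_0)
    chain = []           # will hold X_1, X_2, ..., X_n

    for i in range(1000):
        # random modification of the program state (here, a random-walk step)
        state = state + random.gauss(0.0, 1.0)
        chain.append(state)   # X_i = program state at the bottom of the i-th pass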

MCMC is incredibly useful. It can simulate any probability model and calculate (approximately) any probability or expectation. It makes practical many forms of statistical inference that were impossible before MCMC (not to mention many other uses outside of statistics).

The theory of OMC is just elementary statistics. Suppose we want to evaluate an expectation E f(X) using simulations X1, X2, ..., Xn IID from the same distribution as X. Define Yi = f(Xi). Then the expectation we are trying to estimate is the unknown true population mean of the Yi, which are an IID sample from this population. Anyone who has had introductory statistics knows what to do now: a z test or confidence interval, or a t test or confidence interval if n is small.
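Here is a hedged sketch of the OMC recipe in Python. The target expectation E[X^2] under a standard normal (true value 1) and the sample size are my own illustrative choices, not from the course.

    import math
    import random

    random.seed(42)

    n = 10_000
    y = [random.gauss(0.0, 1.0) ** 2 for _ in range(n)]   # Y_i = f(X_i), IID

    mean = sum(y) / n                                      # point estimate of E f(X)
    var = sum((yi - mean) ** 2 for yi in y) / (n - 1)      # sample variance of the Y_i
    se = math.sqrt(var / n)                                # standard error of the mean

    lo, hi = mean - 1.96 * se, mean + 1.96 * se            # 95% z confidence interval
    print(f"estimate {mean:.4f}, 95% CI ({lo:.4f}, {hi:.4f})")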

MCMC is much the same, only a few things change: the Yi are no longer independent, so the variance in the relevant central limit theorem involves all the autocovariances of the sequence, not just the variance of f(X), and estimating that variance (by the method of batch means, for example) takes more care than the IID formula.

These issues take us out of the realm of introductory statistics, but they are things every statistician should know.
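For concreteness, here is a sketch of the MCMC analogue of the OMC calculation above: the same expectation E[X^2] under a standard normal, but with the Xi forming a Markov chain rather than an IID sample. The particular chain (a Gaussian autoregression whose stationary distribution is standard normal) and the batch-means error estimate are my illustrative choices, not course material.

    import math
    import random

    random.seed(42)

    n = 100_000
    rho = 0.9                      # autocorrelation of the chain
    x = 0.0
    y = []
    for _ in range(n):
        # Markov update: the next state depends only on the current one
        x = rho * x + math.sqrt(1 - rho * rho) * random.gauss(0.0, 1.0)
        y.append(x * x)            # Y_i = f(X_i)

    mean = sum(y) / n

    # batch means: the variance of the means of long consecutive blocks
    # estimates (asymptotic variance) / (batch length)
    b = 100                        # number of batches
    m = n // b                     # batch length
    batch_means = [sum(y[j * m:(j + 1) * m]) / m for j in range(b)]
    var_bm = sum((bm - mean) ** 2 for bm in batch_means) / (b - 1)
    se = math.sqrt(var_bm / b)     # Monte Carlo standard error

    print(f"estimate {mean:.4f}, 95% CI ({mean - 1.96 * se:.4f}, {mean + 1.96 * se:.4f})")

The interval comes out wider than the IID interval for the same n because positive autocorrelation inflates the variance of the sample mean.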

The course will cover both theory and practice of MCMC.

In the realm of theory, we will follow the textbook of Meyn and Tweedie, unfortunately now out of print, but fortunately now on-line courtesy of Sean Meyn. Much of the book is not relevant to MCMC, but much is, especially Chapters 2, 3, 4, 9, 16, and 17.

In the realm of practice, we will not follow a textbook, the available monographs on the subject (Robert and Casella, 2004; Liu, 2001; Chen, Shao, and Ibrahim, 2000; Gilks, Richardson, and Spiegelhalter, 1995) not being to my taste. However, we will cover the basic algorithm (the Metropolis-Hastings-Green algorithm) and its many variants.
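As a taste of the basic algorithm, here is a minimal sketch of random-walk Metropolis, the special case of Metropolis-Hastings-Green with a symmetric fixed-dimension proposal (the Hastings proposal-ratio correction and the Green variable-dimension extension are not shown). The target density, proportional to exp(-|x|^3), is my own illustrative choice.

    import math
    import random

    random.seed(42)

    def log_unnormalized_density(x):
        # log h(x) for the target density h(x) proportional to exp(-|x|^3)
        return -abs(x) ** 3

    def metropolis(log_h, n_iter, x0=0.0, step=1.0):
        """Random-walk Metropolis: needs only the log unnormalized density."""
        x = x0
        log_h_x = log_h(x)
        chain = []
        for _ in range(n_iter):
            prop = x + random.gauss(0.0, step)     # symmetric proposal
            log_h_prop = log_h(prop)
            # accept with probability min(1, h(prop) / h(x))
            if math.log(random.random()) < log_h_prop - log_h_x:
                x, log_h_x = prop, log_h_prop
            chain.append(x)                        # on rejection the state repeats
        return chain

    chain = metropolis(log_unnormalized_density, 10_000)
    print("sample mean:", sum(chain) / len(chain))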

MCMC is often blathered about by certain statisticians as a method of Bayesian inference (in fact some of the books I dislike mentioned above do that). It is that, of course, but it is much more. It is (as I said above) the most general possible simulation method. It is useful for any probabilistic problem in which probabilities and expectations are difficult to calculate exactly and must be done by Monte Carlo (simulation).

Hence we will spend some time on non-Bayesian statistical applications, in particular on MCMC likelihood inference (for problems with complex dependence, such as in spatial statistics, or for problems with latent variables, such as generalized linear mixed models).
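To give the flavor of MCMC likelihood inference, here is a hedged sketch of the key computation for a one-parameter exponential family whose normalizing constant is intractable: the log likelihood ratio between theta and a reference parameter psi is (theta - psi) t(x_obs) - log E_psi[exp((theta - psi) t(X))], and the expectation is approximated by an average over an MCMC sample simulated at psi. The function and variable names are mine, not from the course.

    import math

    def log_likelihood_ratio(theta, psi, t_obs, t_sample):
        """Monte Carlo approximation of l(theta) - l(psi).

        t_obs    -- canonical statistic t(x) of the observed data
        t_sample -- values t(X_i) for an MCMC sample X_1, ..., X_n simulated at psi
        """
        n = len(t_sample)
        # log of (1/n) * sum exp((theta - psi) * t(X_i)), computed stably
        terms = [(theta - psi) * t for t in t_sample]
        m = max(terms)
        log_mean = m + math.log(sum(math.exp(u - m) for u in terms) / n)
        return (theta - psi) * t_obs - log_mean

Maximizing this approximation over theta gives a Monte Carlo approximation to the maximum likelihood estimate.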