A careful examination of our proofs of irreducibility and geometric ergodicity for our point process samplers shows that they tell us very little about the behaviour of the samplers in actual use. The irreducibility proof shows that the chain will eventually reach the empty pattern with no points, but in practice we will never see this event, whose probability is exceedingly small. Thus the proof tells us nothing about the behaviour of the sampler during any run we are likely to have the patience to endure. The proof of geometric ergodicity does tell us something of practical value. If we were to start the chain at a pattern having a great many points, far more than the average number under the stationary distribution, the sampler would move toward the mode of the stationary distribution geometrically fast, losing points until it reaches some bounded set C on which we do not bother to follow its behaviour. The set C may be very large, however. It could be any bounded set; it could be so large that it contains, with high probability, the entire sample path of any run we have the patience to endure. So while geometric ergodicity tells us something, it does not tell us much.
The main benefit of proving geometric ergodicity seems to be the implication that the CLT holds. Of course one cannot calculate the variance (1.26), but one can estimate it using time series methods or regeneration. Examples will appear later in this chapter. This is not much different from most applications of asymptotics in statistics. The theorems have mainly heuristic value, giving one a calculation that will be approximately valid if n is large enough, but no one can tell how large is large enough.
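To make the idea of estimating the CLT variance from the run itself concrete, here is a minimal sketch of one simple estimator, the method of batch means. It is offered only as an illustration and is not necessarily the method used in the examples of this chapter. The function names `batch_means_variance` and `ar1_chain`, the number of batches, and the AR(1) test chain are all assumptions introduced for this sketch. The AR(1) chain is chosen because its asymptotic variance is known in closed form, 1/(1 - rho)^2 for unit innovation variance, so the estimate can be compared with the truth.

```python
import math
import random


def batch_means_variance(chain, n_batches=30):
    """Batch-means estimate of the asymptotic variance sigma^2 in the CLT
    sqrt(n) * (sample mean - true mean) -> Normal(0, sigma^2).

    `chain` is a sequence of scalar functionals of the Markov chain.
    The number of batches is a tuning choice (an assumption of this sketch).
    """
    n = len(chain)
    batch_len = n // n_batches
    if n_batches < 2 or batch_len < 1:
        raise ValueError("need at least 2 batches of length >= 1")
    # Drop the earliest leftover iterations so every batch has equal length.
    used = chain[n - batch_len * n_batches:]
    overall = sum(used) / len(used)
    means = [
        sum(used[i * batch_len:(i + 1) * batch_len]) / batch_len
        for i in range(n_batches)
    ]
    # Sample variance of the batch means, rescaled by the batch length.
    s2 = sum((m - overall) ** 2 for m in means) / (n_batches - 1)
    return batch_len * s2


def ar1_chain(n, rho, seed=0):
    """Hypothetical test chain: x_t = rho * x_{t-1} + N(0, 1) noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


chain = ar1_chain(100_000, rho=0.5)
sigma2_hat = batch_means_variance(chain, n_batches=100)
mcse = math.sqrt(sigma2_hat / len(chain))  # Monte Carlo standard error of the mean
print(f"estimated sigma^2 = {sigma2_hat:.3f} (true value 4), MCSE = {mcse:.4f}")
```

The estimate is only as trustworthy as the assumption that the batches are long enough to be nearly independent, which is the same "how large is large enough" question in another guise.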
In early statistical papers on MCMC, such as Gelfand and Smith (1990), the CLT was avoided because simple conditions that would imply it were not known. The connexion between geometric drift and geometric ergodicity was given by Nummelin and Tuominen (1982) and Nummelin (1984). The CLT given by Chan and Geyer (1994) follows from the fact that Harris recurrence implies what is called α-mixing in the stationary stochastic process literature (Bradley, 1986) and that geometric ergodicity implies exponentially fast α-mixing. Although the proofs are almost identical, the latter seems not to have been noticed before Chan and Geyer (1994). That exponentially fast α-mixing implies a CLT is in Ibragimov and Linnik (1971, Theorem 18.5.3). The other version of the CLT involving geometric drift is also recent (Meyn and Tweedie, 1992, 1993). Lacking useful conditions implying a CLT, statisticians avoided it in MCMC, which is exceedingly peculiar given the prominence of the CLT in other areas of statistics. This shows the power of theory to control practice even when the actual relevance of the theory is questionable.
While the importance of the theory presented here may be small and of little comfort to the practical-minded statistician, there is a point to having proofs of what can be proved. Lacking proofs may set people to wondering and worrying unnecessarily. For Markov chain samplers that converge much faster than any known samplers for general point processes there is a theory due to Rosenthal (1995) that gives bounds on total variation distance, but this seems not to be useful for chains that take many tens of iterations to mix well; see the appendix of the contribution of Møller to this volume.