University of Minnesota, Twin Cities School of Statistics Charlie Geyer's Home Page
Radically Elementary Probability Theory is the title of a book by Edward Nelson (Princeton University Press, 1987, amazon.com web page for his book).
This web page is about University of Minnesota School of Statistics Technical Report No. 657, which is my attempt to fill in some details that Nelson leaves out of his (very short) book. This is a work in progress. I intend to write several more chapters, but needed to turn it into something one of my students could cite.
Both of these books — Nelson's and mine — use nonstandard analysis (Wikipedia entry and Mathworld entry), but a very simple version of nonstandard analysis that IMHO is much easier to understand than conventional measure theory. As I say in my preface
… almost everything one needs to know about nonstandard analysis are the arithmetic rules for infinitesimal, appreciable, and unlimited numbers (a number is unlimited if its reciprocal is infinitesimal, and a number is appreciable if it is neither infinitesimal nor unlimited) given in the tables in our Section 3.3, the principle of external induction, an axiom of the nonstandard analysis used in Nelson (1987) and this book (Axiom IV in Section 2.1), and the principle of overspill (our Section 3.4).
But the aim of our books is very different from that of most nonstandard analysis, which merely aims to provide alternative proofs (using infinitesimals) of conventional theorems that also have conventional proofs. We change the subject of study. We do this by limiting our attention to probability models that have (i) a finite sample space and (ii) no nonempty events of probability zero.
In (i), Robinson-style nonstandard analysis would say "hyperfinite", but in Nelson-style nonstandard analysis (Wikipedia entry for internal set theory) we just say "finite", the point being that infinitesimals and unlimited numbers are real numbers just like any other real numbers.
The point of (i) is that measure theory becomes completely unnecessary. Every probability and expectation, conditional or unconditional, is just a finite sum given by a simple explicit formula. The point of (ii) is that conditional probability and expectation are always well defined: there is no need to consider conditioning on events of probability zero, because the probability of a nonempty conditioning event may be infinitesimal but is never exactly zero.
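To make this concrete, here is a minimal Python sketch (not taken from either book) of a model with a finite sample space in which every outcome has positive probability. The fair-die model and the helper names `expectation`, `conditional_expectation`, `identity`, and `is_even` are illustrative assumptions; the only point is that every expectation, conditional or not, is a finite sum.

```python
from fractions import Fraction

# Hypothetical finite model (an assumption for illustration): a fair die.
# Every outcome has strictly positive probability, so no nonempty event
# has probability zero.
prob = {omega: Fraction(1, 6) for omega in range(1, 7)}
assert all(pr > 0 for pr in prob.values()) and sum(prob.values()) == 1


def expectation(x, prob):
    """E(X) as a finite sum over the sample space."""
    return sum(x(omega) * pr for omega, pr in prob.items())


def conditional_expectation(x, event, prob):
    """E(X | B) = E(X 1_B) / Pr(B), defined for every nonempty event B."""
    pr_b = sum(pr for omega, pr in prob.items() if event(omega))
    if pr_b == 0:
        raise ValueError("the conditioning event is empty")
    total = sum(x(omega) * pr for omega, pr in prob.items() if event(omega))
    return total / pr_b


def identity(omega):
    return omega


def is_even(omega):
    return omega % 2 == 0
```

For example, `conditional_expectation(identity, is_even, prob)` averages the even faces 2, 4, 6 with equal weight. Exact rational arithmetic is used here only to keep the sums transparent.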
Another quote from my preface
One might think that this radical simplification is too radical (throwing the baby out with the bathwater), but Nelson (1987) and this book provide some evidence that this is not so. Even though our theory has no continuous random variables or even discrete random variables with infinite sample space, hence no normal, exponential, Poisson, and so forth random variables, we shall see that finite approximations satisfactorily take their place.
Consider a Binomial(n, p) random variable X such that neither p nor 1 − p is infinitesimal and n is unlimited. Then (the Nelson-style analog of) the central limit theorem says that (X − n p) ⁄ √(n p (1 − p)) has a distribution that is "nearly normal" in the sense that the distribution function of this random variable differs from the distribution function of the standard normal distribution in conventional probability only by an infinitesimal amount at any point (our Theorem 8.2).
Consider a Binomial(n, p) random variable X such that p is infinitesimal but n p is appreciable. Then X has a distribution that is "nearly Poisson" in the sense that the distribution function of this random variable differs from the distribution function of the Poisson(n p) distribution in conventional probability only by an infinitesimal amount at any point (unfortunately this result is in a yet to be written chapter of this book, but is easily proved).
Consider a Geometric(p) random variable X such that p is infinitesimal and choose a (necessarily infinitesimal) number ε such that Y = ε X has appreciable expectation μ. Then Y has a distribution that is "nearly exponential" in the sense that the distribution function of this random variable differs from the distribution function of the Exponential(1 ⁄ μ) distribution in conventional probability only by an infinitesimal amount at any point.
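The three approximations above can be explored numerically. The following Python sketch is not from either book, and several things in it are assumptions: a large standard n (or a small standard p) is only a stand-in for an unlimited n (or an infinitesimal p), the binomial is standardized by its standard deviation √(n p (1 − p)), the geometric variable is taken to live on {1, 2, …} with mean 1 ⁄ p, and all function names are made up for illustration. Each function returns the largest gap between the two distribution functions over the jump points.

```python
import math


def binom_pmf(k, n, p):
    """Binomial(n, p) pmf, computed on the log scale to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def poisson_pmf(k, lam):
    """Poisson(lam) pmf, computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_gap(n, p):
    """Largest gap between the CDF of (X - n p) / sqrt(n p (1 - p)) and normal."""
    sd = math.sqrt(n * p * (1.0 - p))
    cdf = gap = 0.0
    for k in range(n + 1):
        target = normal_cdf((k - n * p) / sd)
        gap = max(gap, abs(cdf - target))   # just below the jump at k
        cdf += binom_pmf(k, n, p)
        gap = max(gap, abs(cdf - target))   # at the jump
    return gap


def poisson_gap(n, p):
    """Largest gap between the Binomial(n, p) and Poisson(n p) CDFs."""
    lam = n * p
    cdf_b = cdf_p = gap = 0.0
    for k in range(n + 1):
        cdf_b += binom_pmf(k, n, p)
        cdf_p += poisson_pmf(k, lam)
        gap = max(gap, abs(cdf_b - cdf_p))
    return gap


def exponential_gap(p, mu, tail=30.0):
    """Largest gap between the CDFs of Y = eps * X and Exponential(1 / mu).

    X is Geometric(p) on {1, 2, ...}; the scan stops at k = tail / p,
    beyond which both survival functions are negligible.
    """
    eps = mu * p                 # chosen so that E(Y) = eps / p = mu
    survival = 1.0               # Pr(X > 0)
    gap = 0.0
    for k in range(1, int(tail / p) + 1):
        target = math.exp(-(k * eps) / mu)      # exponential survival at y = k eps
        gap = max(gap, abs(survival - target))  # just below y
        survival *= 1.0 - p                     # now Pr(X > k)
        gap = max(gap, abs(survival - target))  # at y
    return gap


if __name__ == "__main__":
    print("normal gap:     ", normal_gap(10_000, 0.3))
    print("Poisson gap:    ", poisson_gap(10_000, 3.0 / 10_000))
    print("exponential gap:", exponential_gap(1e-4, 2.0))
```

Rerunning with larger n or smaller p should shrink each gap, which is the standard-world shadow of "differs only by an infinitesimal amount". It is a numerical illustration, not a proof.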
So we actually study nonstandard probability models — like the Binomial(n, p) random variable with unlimited sample size n mentioned above — for their own sake. We do not turn them into continuous random variables.
How interesting this approach is remains to be seen. Since its level
of abstraction is much lower than conventional measure theory, it has
the potential of allowing much more rigor in lower-level courses where
much handwaving and "beyond the scope of this course" now occurs.
Consider a Poisson process. The Nelson-style analog is just a sequence
of independent and identically distributed Bernoulli(p) random
variables with infinitesimal p spaced ε apart on the
real line so that λ = p ⁄ ε is appreciable.
It is trivial to show, using nothing more than Pr(A ∩ B) ⁄
Pr(B) type conditional probability, that the interarrival
times in the Bernoulli process are geometric and, as in the quotation above,
that the interarrival times are "nearly exponential". It is also
easy to show (just the conventional proof of the Poisson approximation
to the binomial) that counts of events in intervals with appreciable
length are "nearly Poisson". No lack of rigor and no handwaving.
You don't need measure theory unless you insist on taking the nonstandard
Bernoulli process to the conventional Poisson process limit.
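As a rough illustration of the Bernoulli process just described, the following Python sketch simulates independent Bernoulli(p) trials on a grid of spacing ε = p ⁄ λ and summarizes interarrival times and counts per unit interval. It is a sketch under stated assumptions, not anything from either book: a small standard p stands in for an infinitesimal one, the default values λ = 2, p = 0.001 and a horizon of 500 time units are arbitrary, 1 ⁄ ε is assumed to be a whole number of grid steps, and the function names are hypothetical.

```python
import random


def simulate_event_steps(p, n_steps, seed=0):
    """Indices i (1-based) of the grid points i * eps at which an event occurs."""
    rng = random.Random(seed)
    return [i for i in range(1, n_steps + 1) if rng.random() < p]


def summarize(lam=2.0, p=0.001, horizon=500, seed=0):
    """Return (mean interarrival time, mean count, variance of count).

    Counts are taken over the unit intervals ((j - 1), j], j = 1..horizon.
    Assumes 1 / eps is (up to rounding) a whole number of grid steps.
    """
    eps = p / lam
    steps_per_unit = round(1.0 / eps)
    n_steps = horizon * steps_per_unit
    steps = simulate_event_steps(p, n_steps, seed)

    # Interarrival times: differences of consecutive event positions.
    gaps = [(b - a) * eps for a, b in zip(steps, steps[1:])]

    # Number of events falling in each unit interval.
    counts = [0] * horizon
    for i in steps:
        counts[(i - 1) // steps_per_unit] += 1

    mean_gap = sum(gaps) / len(gaps)
    mean_count = sum(counts) / horizon
    var_count = sum((c - mean_count) ** 2 for c in counts) / (horizon - 1)
    return mean_gap, mean_count, var_count


if __name__ == "__main__":
    mean_gap, mean_count, var_count = summarize()
    print("mean interarrival time:", mean_gap, "(compare 1 / lam)")
    print("mean count per unit interval:", mean_count, "(compare lam)")
    print("variance of count:", var_count, "(compare lam)")
```

The mean interarrival time should come out close to 1 ⁄ λ, and the mean and variance of the counts should both come out close to λ, as one expects of nearly exponential gaps and nearly Poisson counts. A simulation like this only suggests the result; the actual argument is the elementary one described above.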
For more than that, you'll have to read the books.