Next: Conclusion Up: Likelihood Inference for Spatial Previous: Fitting the Triplets Process

Comparing the Saturated and Triplets Models

Which model, saturation or triplets, fits the data better? Since these are made-up data, simulated from a saturation model, we know that the saturation process should fit better. But is the difference between the two models large enough that we can detect it using statistics? And what statistical procedures do we use?

A standard method of testing nonnested hypotheses was proposed by Cox (1961, 1962). Kent (1986) compares Cox's test with other procedures. Here we follow a different notion, also suggested by Cox (1962) and followed up by Atkinson (1970), of embedding the two models in a supermodel that contains both. This is particularly easy when both models are exponential families. The supermodel is just the family whose vector canonical statistic is the union of the statistics for the two models. The supermodel for the saturation and triplets models is the exponential family with the four-dimensional canonical statistic t(x) = (n(x), s(x), w(x), u(x)). Writing $\theta = (\theta_1, \theta_2, \theta_3, \theta_4)$ for the corresponding canonical parameter, the saturation model is the submodel obtained by setting $\theta_2 = \theta_3 = 0$, and the triplets model is the submodel obtained by setting $\theta_4 = 0$.
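As a minimal sketch of this embedding, each submodel can be viewed as a point in the four-parameter supermodel with some coordinates pinned at zero, with the coordinates ordered as in t(x) = (n(x), s(x), w(x), u(x)). The function names and the example statistic values below are hypothetical and serve only to illustrate the idea.

```python
# Illustrative sketch only: the function names and the toy statistic
# values are assumptions made for this example, not part of the analysis.

def embed_saturation(theta_n, theta_u):
    """Saturation model inside the supermodel: s and w parameters are zero."""
    return (theta_n, 0.0, 0.0, theta_u)

def embed_triplets(theta_n, theta_s, theta_w):
    """Triplets model inside the supermodel: the u parameter is zero."""
    return (theta_n, theta_s, theta_w, 0.0)

def log_unnormalized_density(t, theta):
    """Inner product <t(x), theta>: log unnormalized density of the
    exponential family with respect to its base measure."""
    return sum(ti * th for ti, th in zip(t, theta))

# Hypothetical values of (n(x), s(x), w(x), u(x)) for some pattern x.
t_x = (10, 5, 2, 7)
theta_sat = embed_saturation(1.0, 2.0)
print(log_unnormalized_density(t_x, theta_sat))
```

Because both submodels live in the same parameter space, a single sampler and a single likelihood routine for the supermodel serve all three fits.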

The first order of business is to find the MLE in the combined model. We start at the MLE in the saturation model, which is (4.212, 0, 0, 0.3626), and do two MCL iterations

table1217

giving three to four significant figures. For comparison, the parameter values for the three models are

table1227

As expected, the fit in the combined model is much closer to the fit in the saturation model than to the fit in the triplets model, when we make the comparison in the canonical parameter space.

We will test saturation versus combined and triplets versus combined. If one null hypothesis can be rejected and the other not, then we declare the null hypothesis that cannot be rejected to be the one that fits. If both null hypotheses can be rejected, then we declare that neither model fits. If neither null hypothesis can be rejected, then we declare that both models fit well, and there is no statistically significant difference between them. This is common garden-variety statistics in action, but lest the reader think this inference is easy, Figures 1.4 and 1.5 (following pages) invite an attempt at doing the same inference by eye. It's not an easy task without statistics.
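The decision rule just described can be written down as a small function. This is only a sketch: the function name, the returned strings, and the significance level of 0.05 are assumptions for illustration, since the text does not fix a level.

```python
# Illustrative sketch of the two-test decision rule. The name
# `compare_with_supermodel` and the default alpha are assumptions.

def compare_with_supermodel(p_saturation, p_triplets, alpha=0.05):
    """Combine the two tests of a submodel (null) against the combined model.

    p_saturation: p-value for saturation versus combined
    p_triplets:   p-value for triplets versus combined
    """
    reject_saturation = p_saturation < alpha
    reject_triplets = p_triplets < alpha
    if reject_saturation and reject_triplets:
        return "neither model fits"
    if reject_saturation:
        return "triplets model fits"
    if reject_triplets:
        return "saturation model fits"
    return "both models fit; no significant difference"
```

For example, `compare_with_supermodel(0.40, 0.001)` returns `"saturation model fits"`, because only the triplets null hypothesis is rejected.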

Figure: Simulated point patterns from the saturation process and scatter plot of the distribution of the canonical statistics n(x) and u(x). Three of the point patterns are simulations from the maximum likelihood model; the lower right pattern is the observed data (Figure 1.1). Letters in the scatter plot mark the four patterns; D is the observed data.

Figure: Simulated point patterns from the triplets process and scatter plot of the distribution of the canonical statistics n(x) and s(x). Three of the point patterns are simulations from the maximum likelihood model; the lower right pattern is the observed data (Figure 1.1). Letters in the scatter plot mark the four patterns; D is the observed data.

We now must collect three new samples, one at each of these parameter values, using the sampler for the combined process, in order to use reverse logistic regression. We need new samples because we need the four-dimensional canonical statistic output by the sampler for the combined process. Collecting samples of size 10000 at spacing 200 and running reverse logistic regression gives tex2html_wrap_inline4595 for the log inverse normalizing constants of the three distributions, with estimated Monte Carlo error variance

  equation1277

Now the log likelihood ratio for two models with parameters tex2html_wrap_inline4597 and tex2html_wrap_inline4599 and log normalizing constants tex2html_wrap_inline4601 and tex2html_wrap_inline4603 estimated by reverse logistic regression is tex2html_wrap_inline4605 , and the MC error variance is estimated by the delta method using (1.48) and

displaymath4607
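The calculation can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the formulas above: `eta` denotes a log inverse normalizing constant (minus the log of the normalizing constant), the three samples have equal size so that sample-size offsets cancel in differences, and `cov` is a hypothetical Monte Carlo covariance matrix for the estimated `eta` values. All names and numbers are made up for the example.

```python
# Illustrative sketch only. Assumes eta_k = -log c(theta_k), so the log
# likelihood at theta_k is <t(x), theta_k> + eta_k, and assumes equal
# sample sizes for the reverse logistic regression samples.

def log_likelihood_ratio(t_obs, theta_a, theta_b, eta_a, eta_b):
    """Log likelihood at theta_a minus log likelihood at theta_b."""
    inner = sum(t * (a - b) for t, a, b in zip(t_obs, theta_a, theta_b))
    return inner + eta_a - eta_b

def mc_variance_of_difference(cov, a, b):
    """Variance of eta_a - eta_b given the covariance matrix of the eta
    estimates. The Monte Carlo error enters the log likelihood ratio only
    through this difference, which is linear in the estimates, so the
    delta method reduces to var_a + var_b - 2 * cov_ab."""
    return cov[a][a] + cov[b][b] - 2.0 * cov[a][b]

# Hypothetical inputs, chosen only to exercise the functions.
cov = [[4.0, 1.0, 0.0],
       [1.0, 9.0, 2.0],
       [0.0, 2.0, 16.0]]
print(mc_variance_of_difference(cov, 0, 1))
```

Only differences of the `eta` values matter here, which suits reverse logistic regression, since it determines the log normalizing constants only up to a common additive constant.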

The deviance, twice the log likelihood ratio, has an asymptotic chi-square distribution if the usual asymptotics hold. Assuming they do hold, the results are

table1288

Again we find the result we expected. The saturation model fits the data well. Adding the other two canonical statistics to the model improves the fit no more than what one expects whenever two parameters are added to a model. The triplets model does not fit.
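For readers who want to reproduce this kind of test, here is a minimal sketch of the p-value computation. The saturation model has two fewer parameters than the combined model and the triplets model has one fewer, so the reference distributions are chi-square with 2 and 1 degrees of freedom. The function names are hypothetical, and only these two cases are handled, using closed forms from the standard library.

```python
import math

# Illustrative sketch only: function names are assumptions. Handles just
# the two degrees of freedom that arise in this comparison.

def chi2_upper_tail(x, df):
    """P(X > x) for X chi-square with df = 1 or df = 2."""
    if x < 0:
        raise ValueError("deviance must be nonnegative")
    if df == 1:
        # X is the square of a standard normal variable.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # X is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")

def deviance_p_value(log_likelihood_ratio, df):
    """Asymptotic p-value for a submodel against the combined model,
    where the deviance is twice the log likelihood ratio."""
    return chi2_upper_tail(2.0 * log_likelihood_ratio, df)
```

A small p-value for a submodel means that the extra parameters of the combined model improve the fit by more than chance would explain, so that submodel is rejected.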



Charles Geyer
Fri Jul 5 15:26:21 CDT 1996