Spring Seminar Series - January 28, 2003
University of Minnesota
School of Statistics
College of Liberal Arts

Misspecification Error in Missing Data Models

Yun Ju Sung
School of Statistics
University of Minnesota

Tuesday, January 28, 2003
4:00 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:30 PM, 300 Ford Hall

Abstract

When a statistical model is incorrect, the MLE is inconsistent: it converges to the minimizer θ* of the Kullback-Leibler information. Any difference between the density f_θ* and the true density g is error due to model misspecification. We propose a Monte Carlo method to find θ* when there are missing data and the observed data likelihood has no closed form. The method was motivated by models for mutation accumulation data from statistical genetics.
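
For reference, the target θ* described above can be written out as follows. The notation is an assumption made here for illustration (y for observed data, x for missing data, f_θ(x, y) for the complete-data density); the announcement itself fixes only θ*, f_θ and g.

```latex
% Observed-data density under the model: integrate out the missing data x.
f_\theta(y) = \int f_\theta(x, y) \, dx

% Kullback-Leibler information of the model density relative to the true density g.
K(\theta) = E_g \left[ \log \frac{g(Y)}{f_\theta(Y)} \right]

% theta* minimizes K; the term E_g[log g(Y)] does not depend on theta,
% so this is the same as maximizing the expected log likelihood under g.
\theta^* = \arg\min_\theta K(\theta) = \arg\max_\theta E_g \left[ \log f_\theta(Y) \right]
```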

We prove consistency and asymptotic normality of the Monte Carlo estimate of θ*. The method involves generating two samples, the first for observed data from the true density and the second for missing data from an importance sampling density. The entire second sample is used with each member of the first sample. We show that this results in an asymptotic variance for the estimate smaller than that obtained by using the first sample only once.
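
The two-sample scheme can be illustrated with a minimal sketch on a toy problem. Everything specific here is an assumption chosen for illustration and is not taken from the talk: the model (missing x ~ N(θ, 1), observed y given x ~ N(x, 1)), the true density g (Laplace centered at 1.5), the importance sampling density h = N(1, 3²), the sample sizes, and the exact form of the objective. In this toy model the observed-data density is N(θ, 2), so θ* equals the mean of g, which gives a known answer to compare against.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp
from scipy.stats import norm

rng = np.random.default_rng(0)
n, m = 500, 5000  # sizes of the first and second samples (illustrative)

# First sample: observed data drawn from the assumed true density g.
# g is Laplace, so the normal model below is misspecified.
y = rng.laplace(loc=1.5, scale=1.0, size=n)

# Second sample: missing data drawn from an assumed importance
# sampling density h = N(1, 3^2).
h_loc, h_scale = 1.0, 3.0
x = rng.normal(h_loc, h_scale, size=m)
log_h = norm.logpdf(x, h_loc, h_scale)


def neg_mc_objective(theta):
    """Negative Monte Carlo approximation to E_g[log f_theta(Y)].

    Toy complete-data model: x ~ N(theta, 1), y | x ~ N(x, 1).
    The entire second sample x is reused for every y_i.
    """
    # log f_theta(x_j, y_i), an (n, m) array via broadcasting
    log_joint = (norm.logpdf(x, theta, 1.0)[None, :]
                 + norm.logpdf(y[:, None], x[None, :], 1.0))
    # log of the importance sampling average (1/m) sum_j f_theta(x_j, y_i) / h(x_j)
    log_marginal = logsumexp(log_joint - log_h[None, :], axis=1) - np.log(m)
    return -np.mean(log_marginal)


result = minimize_scalar(neg_mc_objective, bounds=(-5.0, 5.0), method="bounded")
theta_hat = result.x
print("Monte Carlo estimate of theta*:", theta_hat)
```

Reusing one importance sample across all y_i keeps the objective a smooth, deterministic function of θ once the samples are drawn, so a standard optimizer can be applied to it directly.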

If nature, instead of a computer, generates the first sample, then our estimate is a Monte Carlo approximation to the MLE. In that case, its asymptotic variance reflects both the sampling variability of the first sample and the Monte Carlo variability of the second sample.