Spring Seminar Series - January 28, 2003
University of Minnesota
School of Statistics
College of Liberal Arts
Misspecification Error in Missing Data Models
Yun Ju Sung
School of Statistics
University of Minnesota
Tuesday, January 28, 2003
4:00 PM, 115 Ford Hall
Minneapolis, East Bank Campus
Social at 3:30 PM, 300 Ford Hall
Abstract
When a statistical model is incorrect, the MLE is inconsistent, converging
to the minimizer θ* of Kullback-Leibler information. Any difference
between the density fθ* and the true density
g is error due to model misspecification. We propose a Monte Carlo
method to find θ* when there are missing data and the observed-data
likelihood has no closed form. The motivating example is a family of models
for mutation accumulation data from statistical genetics.
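For reference, the target θ* can be written out as follows. This is the standard definition of the Kullback-Leibler minimizer. The notation is an assumption made here for illustration: y denotes observed data, x denotes missing data, and f_θ(x, y) denotes the complete-data density of the working model.

```latex
\[
\theta^* \;=\; \arg\min_{\theta} \int g(y)\,\log\frac{g(y)}{f_\theta(y)}\,dy
        \;=\; \arg\max_{\theta}\; \mathrm{E}_g\bigl\{\log f_\theta(Y)\bigr\},
\qquad
f_\theta(y) \;=\; \int f_\theta(x, y)\,dx .
\]
```

The second equality holds because the term involving only g does not depend on θ. The integral defining f_θ(y) is the observed-data likelihood that has no closed form in the setting of the talk.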
We prove consistency and asymptotic normality of the Monte Carlo estimate
of θ*. The method involves generating two samples, the first for
observed data from the true density and the second for missing data from
an importance sampling density. The entire second sample is used with each
member of the first sample. We show that this results in an asymptotic variance
for the estimate smaller than that obtained by using the first sample only
once.
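The two-sample scheme described above can be sketched on a toy problem. Everything specific in this sketch is an assumption made for illustration and is not taken from the talk: the working model (missing X ~ N(θ, 1), observed Y given X ~ N(X, 1)), the true density g (uniform on [0, 3]), the importance density h (normal with mean 1 and standard deviation 2), the sample sizes, and the golden-section optimizer. The objective maximized is one natural reading of the abstract: the average over the first sample of the log of an importance-sampling average over the entire second sample. For this toy model θ* equals the mean of g, which is 1.5, so the estimate can be checked.

```python
import math
import random


def normal_logpdf(z, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * ((z - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def make_objective(ys, xs, h_mean, h_sd):
    """Build the Monte Carlo objective for the toy model (an assumption):
    X ~ N(theta, 1) is missing and Y | X ~ N(X, 1) is observed."""
    # Part of each importance weight that does not depend on theta:
    # f(y_j | x_i) / h(x_i). Every y_j is paired with every x_i.
    base = [
        [math.exp(normal_logpdf(y, x, 1.0) - normal_logpdf(x, h_mean, h_sd))
         for x in xs]
        for y in ys
    ]

    def objective(theta):
        # theta-dependent part of the weight: density of x_i under N(theta, 1)
        dens_x = [math.exp(normal_logpdf(x, theta, 1.0)) for x in xs]
        total = 0.0
        for row in base:
            # Importance-sampling estimate of the marginal density f_theta(y_j)
            marginal = sum(a * p for a, p in zip(row, dens_x)) / len(xs)
            total += math.log(marginal)
        return total / len(ys)

    return objective


def golden_section_max(func, lo, hi, tol=1e-4):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc > fd:
            # Maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # Maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)


def estimate_theta_star(n=200, m=500, seed=0):
    rng = random.Random(seed)
    # First sample: observed data drawn from the true density g (uniform here)
    ys = [rng.uniform(0.0, 3.0) for _ in range(n)]
    # Second sample: missing data drawn from the importance density h
    h_mean, h_sd = 1.0, 2.0
    xs = [rng.gauss(h_mean, h_sd) for _ in range(m)]
    objective = make_objective(ys, xs, h_mean, h_sd)
    # The search bracket [-2, 5] is an assumption suited to this toy problem
    return golden_section_max(objective, -2.0, 5.0)


theta_hat = estimate_theta_star()
print(theta_hat)
```

The printed value should land near 1.5, up to sampling error in the first sample and Monte Carlo error in the second. Replacing the simulated `ys` with real observations turns the same computation into a Monte Carlo approximation to the MLE.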
If nature, instead of a computer, generates the first sample, then our
estimate is a Monte Carlo approximation to the MLE. In that case its asymptotic
variance reflects both the sampling variability of the first sample and the
Monte Carlo variability of the second sample.