Student Seminar Series - July 27, 2007
University of Minnesota
School of Statistics
College of Liberal Arts
Handling Missing Data in a Longitudinal Model: A Simulation Exercise
Sara Krohn
Friday, July 27, 2007
11:00 AM, 110 Ford Hall
Minneapolis, East Bank Campus
Refreshments at 10:30 AM, 300 Ford Hall
Abstract
Missing data are common in surveys and studies, especially longitudinal ones. In fact, complete data for all subjects are a rarity
in longitudinal studies: subjects often miss visits, equipment malfunctions, and so on. While we can still
analyze data with missing values, we must be careful about the inferences we draw, because the implications depend on
the type of missing data: missing completely at random (MCAR), missing at random (MAR), or non-ignorable (NI).
Many methods now exist for handling MCAR and MAR data. Many of them fill in the missing values using imputation
techniques that range from very simple (e.g., last value carried forward) to quite complex (e.g., multiple imputation). While imputation seems like the
ideal thing to do, it can lead to underestimating the variance in a model if the imputed values are analyzed as actual values
in a subsequent model fit. In this paper, we generate data similar to those in the Body Composition substudy of the Strategies
for Management of Anti-Retroviral Therapy (SMART) trial and, through a computer simulation exercise, compare five methods for handling
the different types of missing data found in the actual study. We discuss the methodology behind the simulation and conclude that
leaving the missing data as missing in a double-repeated measures longitudinal analysis most accurately captured the characteristics of
the true data set in this situation.
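
The variance problem mentioned in the abstract can be illustrated with a small sketch. It is not the simulation from the talk: the measurements are made-up numbers, and single mean imputation stands in for the imputation methods named above because its effect on the variance is easy to verify by hand. Filling each gap with the observed mean adds no new squared deviation but increases the sample size, so both the sample variance and the standard error of the mean shrink when the filled-in values are treated as real observations.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical measurements (illustration only); None marks a missed visit.
observed = [21.0, 24.5, None, 19.8, None, 26.1, 22.3, None, 23.7, 20.4]

# Complete-case analysis: use only the values that were actually measured.
complete_cases = [v for v in observed if v is not None]

# Single mean imputation: fill each gap with the observed mean, then
# (naively) treat the filled-in values as if they were real data.
fill = statistics.mean(complete_cases)
imputed = [fill if v is None else v for v in observed]

var_cc = statistics.variance(complete_cases)
var_imp = statistics.variance(imputed)
se_cc = standard_error(complete_cases)
se_imp = standard_error(imputed)

print(f"complete cases: n={len(complete_cases)}, variance={var_cc:.3f}, SE={se_cc:.3f}")
print(f"mean-imputed:   n={len(imputed)}, variance={var_imp:.3f}, SE={se_imp:.3f}")
```

The mean is unchanged by the imputation, but the apparent precision improves even though no new information was collected, which is the sense in which analyzing imputed values as actual values understates uncertainty.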