Abstract

Matthew Blackwell, James Honaker, and Gary King, "Multiple Overimputation: A Unified Approach to Measurement Error and Missing Data", December 2012

Although social scientists devote considerable effort to mitigating measurement error during data collection, they usually ignore the issue during data analysis. And although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model dependence, difficult computation, or inapplicability with multiple mismeasured variables. We develop an easy-to-use alternative without these problems; it generalizes the popular multiple imputation (MI) framework by treating missing data problems as a special case of extreme measurement error and corrects for both. Like mi, the proposed “multiple overimputation” (MO) framework is a simple two-step procedure. First, multiple (≈ 5) completed copies of the data set are created where cells measured without error are held constant, those missing are imputed from the distribution of predicted values, and cells (or entire variables) with measurement error are “overimputed,” that is imputed from the predictive distribution with observation-level priors defined by the mismeasured values and available external information, if any. In the second step, analysts can then run whatever statistical method they would have run on each of the overimputed data sets as if there had been no missingness or measurement error; the results are then combined via a simple averaging procedure. We also offer easy-to-use open source software that implements all the methods described herein.

back