Everybody has them, nobody wants them!
Empirical researchers are often confronted with missing values in their data sets. Because the phenomenon is usually not seen as a threat to the validity of the research, the most common approach to the problem is simply to ignore it. A closer look at the data, however, often reveals 5% to 20% missing values in a few variables, which considerably reduces the data available for any multivariate analysis.
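To see how quickly moderate missing rates add up, consider a small sketch. It assumes, purely for illustration, that values are missing independently across variables, so the share of fully observed cases is the product of the per-variable response rates. The function name and the three rates are hypothetical, chosen from the 5% to 20% range mentioned above.

```python
def complete_case_fraction(missing_rates):
    """Share of cases with no missing value in any variable.

    Assumption (illustrative only): missingness is independent
    across variables, so the observed shares simply multiply.
    """
    fraction = 1.0
    for rate in missing_rates:
        fraction *= 1.0 - rate
    return fraction


# Hypothetical missing rates for three variables in one analysis.
rates = [0.05, 0.10, 0.20]
print(f"Complete cases: {complete_case_fraction(rates):.1%}")  # prints "Complete cases: 68.4%"
```

Under this assumption almost a third of the cases would be lost to an analysis that uses complete cases only. If the same respondents tend to skip several questions, fewer cases are lost than the product suggests.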
Moreover, these blind spots are often not scattered randomly across the responses. We find particular socio-economic groups or minorities disproportionately affected by missing values. Even worse is the case where the missingness depends on the variable of interest itself; it is common, for example, for the highest incomes to be reported as unknown. The same happens when, for example, populations with the worst health conditions or at high risk refuse to be sampled. Finally, the quality of response deteriorates with long and boring questionnaires, which are common practice in media research.
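The income case can be made concrete with a few made-up numbers. The sketch below assumes ten hypothetical respondents, of whom the two with the highest incomes do not report; the figures and the function name are invented for illustration only.

```python
def mean_without_top(values, n_missing):
    """Mean of the values that remain when the n_missing largest are unobserved."""
    if not 0 <= n_missing < len(values):
        raise ValueError("n_missing must leave at least one observed value")
    observed = sorted(values)[: len(values) - n_missing]
    return sum(observed) / len(observed)


# Hypothetical monthly incomes of ten respondents (made-up figures).
incomes = [1200, 1500, 1800, 2100, 2400, 2700, 3000, 3500, 6000, 9000]

true_mean = sum(incomes) / len(incomes)       # all ten incomes known
observed_mean = mean_without_top(incomes, 2)  # the two highest are missing

print(true_mean, observed_mean)
```

In this toy example the mean of the observed incomes falls well below the mean of all ten. Collecting more responses would not repair this as long as the same mechanism keeps removing the top of the distribution.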
In all these cases, missing data can be a threat to the research, and the remaining data are anything but representative of the population of interest. In general, we have found multiple imputation to be a very helpful and powerful tool for getting the right answers even in the presence of nonresponse.
copyright © 2003 by susanne
last modified Feb 27 2003