Article ID: | iaor20061443 |
Country: | United States |
Volume: | 20 |
Issue: | 3 |
Start Page Number: | 231 |
End Page Number: | 238 |
Publication Date: | Jul 2005 |
Journal: | Statistical Science |
Authors: | Hand David J., Veaux Richard D. De |
Keywords: | data mining |
As Huff's landmark book made clear, lying with statistics can be accomplished in many ways. Distorting graphics, manipulating data or using biased samples are just a few of the tried-and-true methods. Failing to use the correct statistical procedure or failing to check the conditions under which the selected method is appropriate can distort results as well, whether the motives of the analyst are honorable or not. Even when the statistical procedure and motives are correct, bad data can produce results that have no validity at all. This article provides some examples of how bad data can arise, what kinds of bad data exist, how to detect and measure bad data, and how to improve the quality of data that have already been collected.