Article ID: | iaor20073475 |
Country: | United States |
Volume: | 50 |
Issue: | 7 |
Start Page Number: | 967 |
End Page Number: | 982 |
Publication Date: | Jul 2004 |
Journal: | Management Science |
Authors: | Jacob Varghese S., Sarkar Sumit, Parssian Amir |
The cost associated with making decisions based on poor-quality data is quite high. Consequently, the management of data quality and the quality of associated data management processes has become critical for organizations. An important first step in managing data quality is the ability to measure the quality of information products (derived data) based on the quality of the source data and associated processes used to produce the information outputs. We present a methodology to determine two data quality characteristics – accuracy and completeness – that are of critical importance to decision makers. We examine how the quality metrics of source data affect the quality of information outputs produced using the relational algebra operations selection, projection, and Cartesian product. Our methodology is general and can be used to determine how quality characteristics associated with diverse data sources affect the quality of the derived data.
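
The idea of propagating a source quality metric through a relational operation can be illustrated with a minimal sketch. This is not the authors' methodology, whose definitions the abstract does not give. It assumes a simplified, hypothetical definition in which each tuple is flagged accurate or inaccurate, and a tuple of a Cartesian product counts as accurate only when both of its component tuples are accurate. The relations `r` and `s` and the helper `accuracy` are made up for illustration.

```python
from itertools import product


def accuracy(flags):
    """Return the fraction of tuples flagged accurate (flags is a list of bools)."""
    return sum(flags) / len(flags)


# Hypothetical accuracy flags for the tuples of two source relations.
r = [True, True, True, False]  # 3 of 4 tuples accurate
s = [True, False]              # 1 of 2 tuples accurate

# Cartesian product: pair every tuple of r with every tuple of s.
# Assumed rule: a combined tuple is accurate only if both parts are accurate.
rs = [a and b for a, b in product(r, s)]

acc_r = accuracy(r)
acc_s = accuracy(s)
acc_rs = accuracy(rs)

print(acc_r, acc_s, acc_rs)
```

Under this simplified definition the accuracy of the product equals the product of the source accuracies, because the number of accurate pairs is the number of accurate tuples in `r` times the number in `s`. Selection and projection would need further assumptions, for example about which attributes carry the errors, so they are left out of the sketch.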