Article ID: | iaor19982555 |
Country: | United States |
Volume: | 9 |
Issue: | 1 |
Start Page Number: | 1 |
End Page Number: | 14 |
Publication Date: | Dec 1997 |
Journal: | INFORMS Journal On Computing |
Authors: | Martin Kipp, Bala Mohan |
Keywords: | information, programming: integer |
A critical step in the process of creating a relational data base is normalization, whereby a series of tables is created in order to avoid redundancy problems and insertion and deletion anomalies. Several important normalization decision problems are NP-Complete. In this paper we develop a unified approach to data base normalization using linear integer programming. These linear integer programming formulations result from structuring normalization problems as gainfree Leontief substitution flow problems on hypergraphs. We provide a sufficient condition based on the structural properties of the hypergraph which yields a strongly polynomial algorithm for testing whether or not a given decomposition is in third normal form. We also show that under these conditions the key of a relation is unique and a relation in third normal form is also in Boyce–Codd normal form. We also provide linear integer programming formulations for solving normalization decision problems when no special structure is present. We test these linear integer programming models for normalization against existing normalization algorithms and find the mathematical programming approach is several orders of magnitude faster. Our results suggest that the mathematical programming approach is practical for normalizing actual large scale data bases.