Article ID: | iaor20021925 |
Country: | United States |
Volume: | 13 |
Issue: | 1 |
Start Page Number: | 2 |
End Page Number: | 28 |
Publication Date: | Dec 2001 |
Journal: | INFORMS Journal On Computing |
Authors: | Zionts Stanley, Ramesh R., Gopal Ram D. |
Keywords: | computers: data-structure |
The focus of this work is on join optimization in relational database systems. The importance of join optimization is critically underscored by the high cost of relational joins and their frequent needs in traditional as well as emerging database applications. We demonstrate that the sequence in which the pages of the relations are accessed to process a join is a critical determinant of the join execution cost and an optimization of this sequence can lead to a significant improvement in performance over the traditional approaches. We initially develop three network structures to represent a join on two relations: page connectivity graph, cascade and block tree cascade. A page connectivity graph is a bipartite representation of the set of connected pages in the two relations according to the join predicate. To reveal the structural properties of the join, the nodes of the bipartite graph are ordered into a set of levels, and the resulting isomorphic structure is termed a cascade. From the cascade, a tree structure termed a block tree cascade is derived by selectively grouping nodes at each level of the cascade into blocks. We formulate the join as a tree traversal process, and accordingly develop efficient tree traversal algorithms. We develop a compact data structure to store the resulting access path, and provide a comprehensive analysis of the algorithms with detailed assessments of their performance. The performance evaluation demonstrates that the proposed approach can result in significant cost savings over the current join processing methods, for low to modest values of the join selectivity factor.