Article ID: | iaor20043726 |
Country: | Netherlands |
Volume: | 37 |
Issue: | 1 |
Start Page Number: | 83 |
End Page Number: | 102 |
Publication Date: | Apr 2004 |
Journal: | Decision Support Systems |
Authors: | Chung Chin-Wan, Chun Seok-Ju, Lee Seok-Lyong |
Keywords: | databases |
Data cubes support a powerful data analysis method called the range-sum query. The range-sum query is widely used to find trends and to discover relationships among attributes in diverse database applications. A range-sum query computes aggregate information over an online analytical processing (OLAP) data cube within specified query ranges. Existing techniques for range-sum queries on data cubes use an additional cube, called the prefix sum cube (PC), to store the cumulative sums of the data, which causes a high space overhead. This space overhead not only leads to extra storage costs, but also causes additional propagation of updates and longer access times on physical devices. In this paper, we present a new cube representation called ‘the PC Pool’, which drastically reduces the space required by the PC in a large data warehouse. The PC Pool also decreases the update propagation caused by the dependency between the values in the cells of the PC. We develop an effective algorithm that finds dense sub-cubes in a large data cube. We perform extensive experiments with diverse data sets, and examine the space reduction and performance of our proposed method with respect to various data cube dimensions and query sizes. Experimental results show that our method reduces the space of the PC while maintaining reasonable query performance.
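
For readers unfamiliar with the prefix sum cube that the abstract refers to, the following is a minimal sketch of the classical prefix-sum technique on a two-dimensional cube. It is not the PC Pool method of the paper, and the function names `build_prefix_cube` and `range_sum` and the sample data are assumptions made for illustration only. It shows why the PC needs as many cells as the original cube and how a range-sum query is then answered from a constant number of PC cells.

```python
def build_prefix_cube(cube):
    """Return pc where pc[i][j] is the sum of cube[0..i][0..j] (inclusive)."""
    rows, cols = len(cube), len(cube[0])
    pc = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            above = pc[i - 1][j] if i > 0 else 0
            left = pc[i][j - 1] if j > 0 else 0
            diag = pc[i - 1][j - 1] if i > 0 and j > 0 else 0
            # Inclusion-exclusion: the diagonal block is counted twice.
            pc[i][j] = cube[i][j] + above + left - diag
    return pc


def range_sum(pc, lo, hi):
    """Sum of the original cube over rows lo[0]..hi[0] and columns lo[1]..hi[1]."""
    (r1, c1), (r2, c2) = lo, hi

    def at(i, j):
        # Cells outside the cube contribute zero.
        return pc[i][j] if i >= 0 and j >= 0 else 0

    return at(r2, c2) - at(r1 - 1, c2) - at(r2, c1 - 1) + at(r1 - 1, c1 - 1)


# Hypothetical 3 x 3 data cube used only for illustration.
sample_cube = [
    [3, 5, 1],
    [2, 4, 6],
    [7, 0, 8],
]
sample_pc = build_prefix_cube(sample_cube)
print(range_sum(sample_pc, (1, 1), (2, 2)))  # sums the cells 4, 6, 0, 8
```

In this sketch, a change to one cell of the original cube alters every PC cell at or beyond that position in both dimensions, which is the kind of update propagation the abstract describes.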