Article ID: | iaor201110877 |
Volume: | 52 |
Issue: | 1 |
Start Page Number: | 40 |
End Page Number: | 51 |
Publication Date: | Dec 2011 |
Journal: | Decision Support Systems |
Authors: | Wu Xindong, Zhu Xingquan, Li Bin, He Dan, Zhang Chengqi |
Keywords: | information, systems, networks, heuristics |
The purpose of data mining from distributed information systems is usually threefold: (1) identifying locally significant patterns in individual databases; (2) discovering significant patterns that emerge after unifying distributed databases in a single view; and (3) finding patterns that follow special relationships across different data collections. While existing research has significantly advanced the techniques for mining local and global patterns (the first two goals), very few attempts have been made to discover patterns across distributed databases (the third goal). Moreover, no framework currently exists to support the mining of all three types of patterns. This paper proposes solutions for discovering patterns from distributed databases. More specifically, we treat pattern mining as a query process whose purpose is to discover patterns from distributed databases such that the relationships among the patterns satisfy user-specified query constraints. We argue that existing self-contained mining frameworks are neither efficient nor feasible for this objective, mainly because their pattern pruning is single-database oriented. To solve the problem, we advocate a cross-database pruning concept and propose a collaborative pattern (CLAP) mining framework with cross-database pruning mechanisms for distributed pattern mining. In CLAP, distributed databases collaboratively exchange pattern information between sites, so that each site can leverage information from other sites to achieve cross-database pruning. Experimental results show that CLAP fits a niche position, and demonstrate that CLAP not only outperforms its peers with significant runtime performance gains, but also helps find patterns that other methods cannot discover.
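
The cross-database pruning idea can be illustrated with a minimal sketch. This is not the CLAP algorithm itself, whose details the abstract does not give. It assumes one simple query constraint (a pattern must be frequent in every database) and swaps in a plain Apriori-style level-wise search. The function names, the toy transactions, and the `min_support` value are all hypothetical. A candidate itemset is dropped for every site as soon as any one site reports it as infrequent, so no site spends effort extending patterns that another site has already ruled out.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def mine_with_cross_pruning(databases, min_support):
    """Find itemsets whose support is >= min_support in EVERY database.

    Assumed query constraint (not taken from the paper): frequent at all sites.
    Returns a dict mapping each surviving itemset to its per-site supports.
    """
    dbs = [[frozenset(t) for t in db] for db in databases]
    items = sorted({item for db in dbs for t in db for item in t})
    level = [frozenset([i]) for i in items]
    result = {}
    while level:
        survivors = []
        for cand in level:
            sups = []
            for db in dbs:
                s = support(cand, db)
                if s < min_support:
                    break  # one site fails: prune the candidate for all sites
                sups.append(s)
            else:
                result[cand] = sups
                survivors.append(cand)
        # Build the next level only from candidates that survived at all sites.
        k = len(level[0]) + 1
        kept = set(survivors)
        next_level = set()
        for a, b in combinations(survivors, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in kept for sub in combinations(union, k - 1)
            ):
                next_level.add(union)
        level = sorted(next_level, key=sorted)
    return result


# Hypothetical toy data: two sites, each a list of transactions.
site_a = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
site_b = [{"a", "b"}, {"a", "b", "d"}, {"a", "d"}, {"b", "d"}]
patterns = mine_with_cross_pruning([site_a, site_b], min_support=0.5)
```

In this toy run, item `c` is frequent at the first site but absent from the second, so it is pruned before any pair containing `c` is generated. A single-database miner at the first site would still have explored `{a, c}` and `{b, c}`.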