Article ID: | iaor20051917 |
Country: | South Korea |
Volume: | 29 |
Issue: | 4 |
Start Page Number: | 41 |
End Page Number: | 60 |
Publication Date: | Dec 2004 |
Journal: | Journal of the Korean ORMS Society |
Authors: | Kim Jin-Hwa, Min Jin-Young |
Keywords: | datamining |
Stream data is a data set that accumulates continuously in data storage from a data source over time. In many cases, the data set grows increasingly large. Mining information from such massive data requires substantial resources, including storage, memory, and time. These characteristics make it difficult and expensive to use the full data set accumulated over time. On the other hand, if only recent data or a part of the whole data set is used to mine information or patterns, potentially useful information can be lost. To avoid this problem, we suggest a method that efficiently accumulates information over time in the form of rule sets. It requires much less storage than traditional mining methods. The accumulated rule sets are then used as prediction models. According to theories of ensemble approaches, a combination of many prediction models, here in the form of systematically merged rule sets, performs better than a single prediction model. This study uses a customer data set to predict the buying power of customers based on their information. It tests the performance of the suggested method on this data set along with general prediction methods and compares their performance.
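
The idea of accumulating rule sets over time and using the merged set as one ensemble-style predictor can be illustrated with a minimal sketch. The abstract does not say how rules are mined or merged, so everything below is an assumption made for illustration and not the authors' algorithm: the toy miner `mine_rules`, the summed-weight merge in `merge_rule_sets`, the weighted vote in `predict`, and the sample attributes `income` and `age` are all hypothetical.

```python
from collections import defaultdict

# A rule is keyed as (condition, label), where condition is a frozenset of
# (attribute, value) pairs. The stored value is the rule's weight (its support).

def mine_rules(batch, min_count=2):
    """Toy rule miner (hypothetical): keep one single-attribute rule per
    (attribute, value, label) combination seen at least min_count times."""
    counts = defaultdict(int)
    for record, label in batch:
        for attr, value in record.items():
            counts[(frozenset([(attr, value)]), label)] += 1
    return {rule: n for rule, n in counts.items() if n >= min_count}

def merge_rule_sets(accumulated, new_rules):
    """Merge a new batch's rules into the accumulated set by summing weights.
    Only the rules are stored, so the raw batch can be discarded afterwards."""
    merged = dict(accumulated)
    for rule, weight in new_rules.items():
        merged[rule] = merged.get(rule, 0) + weight
    return merged

def predict(rules, record, default=None):
    """Weighted vote over all rules whose condition matches the record."""
    votes = defaultdict(int)
    items = set(record.items())
    for (condition, label), weight in rules.items():
        if condition <= items:  # every (attribute, value) pair is present
            votes[label] += weight
    return max(votes, key=votes.get) if votes else default

# Two batches arriving at different times (made-up customer records).
batch_1 = [
    ({"income": "high", "age": "30s"}, "strong"),
    ({"income": "high", "age": "40s"}, "strong"),
    ({"income": "low", "age": "30s"}, "weak"),
    ({"income": "low", "age": "20s"}, "weak"),
]
batch_2 = [
    ({"income": "high", "age": "20s"}, "strong"),
    ({"income": "low", "age": "20s"}, "weak"),
    ({"income": "low", "age": "20s"}, "weak"),
]

merged = {}
for batch in (batch_1, batch_2):
    merged = merge_rule_sets(merged, mine_rules(batch))

print(predict(merged, {"income": "high", "age": "30s"}))  # prints "strong"
```

In this sketch the storage cost grows with the number of distinct rules instead of the number of records, which is the kind of saving the abstract describes. A real system would likely also need rule pruning and conflict resolution, which are left out here.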