Article ID: | iaor201110713 |
Volume: | 218 |
Issue: | 7 |
Start Page Number: | 3247 |
End Page Number: | 3264 |
Publication Date: | Dec 2011 |
Journal: | Applied Mathematics and Computation |
Authors: | Sana Shib Sankar, Chaudhuri Kripasindhu, Sarkar Bikash Kanti |
Keywords: | heuristics: genetic algorithms, optimization |
Classification systems are important for decision making and have attracted the attention of many researchers. Traditional classifiers are usually either domain specific or produce unsatisfactory results on large classification problems and on imbalanced data. Hence, genetic algorithms (GAs) have recently been combined with traditional classifiers to find useful knowledge for decision making. However, the main concerns with such GA-based systems are that they cover only a small part of the search space and that their computational cost increases as the population grows. In this paper, a rule-based knowledge discovery model is introduced that combines C4.5 (a decision-tree-based rule induction algorithm) with a new parallel genetic algorithm based on the idea of massive parallelism. The prime goal of the model is to produce a compact set of informative rules from any kind of classification problem. More specifically, the proposed model uses C4.5 as a base method to generate rules, which are then refined by the proposed parallel GA. The developed system is compared with pure C4.5 as well as with a hybrid system (C4.5 + sequential genetic algorithm) on six real-world benchmark data sets collected from the UCI (University of California at Irvine) machine learning repository. Experiments on these data sets validate the effectiveness of the new model. The results indicate in particular that the model is powerful for large-volume data sets.
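The abstract does not say how the parallel GA is organised, so the following is only a minimal sketch of the general idea: start from a set of candidate rules and let a GA search for a compact subset that still classifies well. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the hand-written threshold rules standing in for C4.5 output, the bitmask encoding, the size penalty in the fitness function, and the island model with ring migration (one common way to structure a parallel GA, swapped in here in place of the authors' scheme). The islands are evolved in a plain loop; that loop is the part a parallel implementation would distribute across workers.

```python
import random

# Hypothetical toy data (features, label); not from the paper.
DATA = [((x, y), int(x + y > 10)) for x in range(11) for y in range(11)]

# Hypothetical candidate rules standing in for C4.5 output:
# (attribute index, threshold, class predicted when value > threshold).
RULES = [(0, 5, 1), (1, 5, 1), (0, 8, 1), (1, 8, 1), (0, 2, 1), (1, 2, 1)]
DEFAULT_CLASS = 0


def predict(mask, row):
    """The first kept rule that fires decides; otherwise use the default class."""
    for keep, (attr, thr, cls) in zip(mask, RULES):
        if keep and row[attr] > thr:
            return cls
    return DEFAULT_CLASS


def fitness(mask, penalty=0.01):
    """Accuracy minus a small penalty per kept rule, to favour compact rule sets."""
    correct = sum(predict(mask, row) == label for row, label in DATA)
    return correct / len(DATA) - penalty * sum(mask)


def evolve(pop, rng, mut_rate=0.1):
    """One generation: elitism, tournament selection, one-point crossover, bit flips."""
    children = [max(pop, key=fitness)]  # keep the island's best individual
    while len(children) < len(pop):
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        cut = rng.randrange(1, len(RULES))
        child = p1[:cut] + p2[cut:]
        children.append(tuple(b ^ (rng.random() < mut_rate) for b in child))
    return children


def refine(n_islands=4, pop_size=10, generations=30, migrate_every=5, seed=0):
    """Evolve several islands and return the best rule-selection mask found."""
    rng = random.Random(seed)
    full = tuple(1 for _ in RULES)  # the unrefined rule set: keep every rule
    islands = [
        [full] + [tuple(rng.randint(0, 1) for _ in RULES) for _ in range(pop_size - 1)]
        for _ in range(n_islands)
    ]
    for gen in range(1, generations + 1):
        # Islands do not interact within a generation, so this step could run in parallel.
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migrate_every == 0:
            # Ring migration: each island's best replaces the next island's worst.
            bests = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = bests[i - 1]
    return max((ind for pop in islands for ind in pop), key=fitness)
```

Calling `refine()` returns a tuple of 0/1 flags, one per entry in `RULES`, marking which rules are kept. Because each island always carries its best individual forward, the returned mask never scores worse under `fitness` than the full, unrefined rule set.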