Article ID: | iaor200730 |
Country: | Netherlands |
Volume: | 169 |
Issue: | 3 |
Start Page Number: | 898 |
End Page Number: | 917 |
Publication Date: | Mar 2006 |
Journal: | European Journal of Operational Research |
Authors: | Rayward-Smith V.J., Iglesia B. de la, Richards G., Philpott M.S. |
Keywords: | heuristics, data mining |
In this paper, we present an application of multi-objective metaheuristics to the field of data mining. We introduce the data mining task of nugget discovery (also known as partial classification) and show how the multi-objective metaheuristic algorithm NSGA II can be modified to solve this problem. We also present an alternative algorithm for the same task, the ARAC algorithm, which can find all rules that are best according to some measures of interest, subject to certain constraints. For databases of limited size, the ARAC algorithm can deliver the Pareto optimal front, consisting of all partial classification rules that lie on the upper confidence/coverage border, and so provides an excellent basis for comparison with the results of the multi-objective metaheuristic algorithm. We present the results of experiments with both algorithms on various well-known databases. We also discuss how the two methods can complement each other on large databases to deliver a set of best rules according to some predefined criteria, providing a powerful tool for knowledge discovery in databases.
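
The idea of a Pareto optimal front over confidence and coverage can be illustrated with a small sketch. The code below is not the NSGA II or ARAC algorithm from the paper; it is a brute-force non-dominated filter over a handful of made-up rules. The `Rule` type, the function names, and the numeric values are assumptions introduced only for illustration. It also assumes the common definitions in which confidence is the fraction of records matched by the rule that belong to the target class, and coverage is the fraction of target-class records that the rule matches.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical container for a partial classification rule's scores."""
    name: str
    confidence: float  # assumed: |matched and in class| / |matched|
    coverage: float    # assumed: |matched and in class| / |in class|


def dominates(p: Rule, q: Rule) -> bool:
    """True if p is at least as good as q on both measures and better on one."""
    at_least_as_good = p.confidence >= q.confidence and p.coverage >= q.coverage
    strictly_better = p.confidence > q.confidence or p.coverage > q.coverage
    return at_least_as_good and strictly_better


def pareto_front(rules: List[Rule]) -> List[Rule]:
    """Keep the rules that no other rule dominates (quadratic-time filter)."""
    return [r for r in rules if not any(dominates(o, r) for o in rules)]


# Made-up scores, for illustration only.
rules = [
    Rule("r1", 0.90, 0.20),
    Rule("r2", 0.75, 0.55),
    Rule("r3", 0.70, 0.50),  # dominated by r2
    Rule("r4", 0.60, 0.80),
    Rule("r5", 0.60, 0.30),  # dominated by r2 and r4
]

front = pareto_front(rules)
print([r.name for r in front])  # prints ['r1', 'r2', 'r4']
```

The surviving rules trade confidence against coverage: none of them can be improved on one measure without losing on the other, which is the sense in which they lie on the upper confidence/coverage border.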