Article ID: | iaor20119429 |
Volume: | 51 |
Issue: | 4 |
Start Page Number: | 794 |
End Page Number: | 809 |
Publication Date: | Nov 2011 |
Journal: | Decision Support Systems |
Authors: | Coelho André L V, Fernandes Everlândio, Faceli Katti |
Keywords: | programming: multiple criteria, biology |
This paper investigates a genetic programming (GP) approach to the multi‐objective design of hierarchical consensus functions for clustering ensembles. In this approach, data partitions obtained via different clustering techniques are continuously refined (via selection and merging) by a population of fusion hierarchies that use complementary validation indices as objective functions. To assess the efficiency and effectiveness of the novel framework, a series of systematic experiments was conducted on a number of artificial, benchmark and bioinformatics datasets; these involved eleven variants of the proposed GP‐based algorithm and a comparison with basic as well as advanced clustering methods (some of which are clustering ensembles and/or multi‐objective in nature). Overall, the results corroborate the view that having fusion hierarchies operate on well‐chosen subsets of data partitions is a sound strategy that may yield significant gains in clustering robustness.
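As a rough illustration of what a single "merging" step in a clustering ensemble can look like, the sketch below fuses several partitions of the same data through a co-association (evidence accumulation) rule. This is not the GP-based method of the paper: the function names `co_association` and `fuse`, the majority threshold, and the toy partitions are all assumptions made for illustration only.

```python
from itertools import combinations


def co_association(partitions):
    """For each pair of points, the fraction of partitions that put them in the same cluster."""
    n = len(partitions[0])
    return {
        (i, j): sum(1 for p in partitions if p[i] == p[j]) / len(partitions)
        for i, j in combinations(range(n), 2)
    }


def fuse(partitions, threshold=0.5):
    """Hypothetical consensus step: link pairs whose co-association exceeds
    `threshold`, then return the connected components as a new partition."""
    n = len(partitions[0])
    parent = list(range(n))  # union-find forest over the data points

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), weight in co_association(partitions).items():
        if weight > threshold:
            parent[find(j)] = find(i)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Three toy partitions of six points (assumed data, not from the paper).
p1 = [0, 0, 1, 1, 2, 2]
p2 = [0, 0, 0, 1, 1, 1]
p3 = [0, 0, 1, 1, 1, 1]
consensus = fuse([p1, p2, p3])
print(consensus)  # [0, 0, 1, 1, 1, 1]
```

In the setting the abstract describes, steps of this kind would be arranged in a hierarchy and the resulting partitions scored by several validation indices at once; the sketch covers only one flat fusion with a fixed rule.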