Article ID: | iaor20164043 |
Volume: | 32 |
Issue: | 4 |
Start Page Number: | 646 |
End Page Number: | 667 |
Publication Date: | Nov 2016 |
Journal: | Computational Intelligence |
Authors: | Shin Kilho, Miyazaki Seiya |
Keywords: | statistics: regression, combinatorial optimization |
Consistency‐based feature selection is an important category of feature selection research. Its advantage over other categories comes from its use of consistency measures, which incorporate the effect of interaction among features into the evaluation of feature relevance. Even if features individually appear irrelevant to class labels, they can collectively show strong relevance; in such cases, we say that the features interact with each other. Consistency measures evaluate the collective relevance of a set of features and have been intuitively understood as metrics of the distance of an arbitrary feature set from the state of being consistent: a set of features is said to be consistent if, and only if, the features as a whole determine class labels. Historically, the binary consistency measure, which returns 1 if the feature set is consistent and 0 otherwise, was the first consistency measure introduced, and many advanced measures followed. The problem with the binary measure is that it always returns 0 if a data set includes no consistent feature set. The measures that followed solved this problem but sacrificed time efficiency of evaluation. Therefore, feature selection leveraging these measures is not fast enough to apply to large data sets. In this article, we aim to improve the time efficiency of consistency‐based feature selection. To achieve this goal, we propose a new idea, which we call data set denoising: we eliminate examples that are viewed as noise from a data set until the data set comes to include consistent feature sets, and then apply the binary measure to find an appropriate consistent feature set. In our evaluation through intensive experiments, CWC, a new algorithm that implements data set denoising, outperformed the benchmark consistency‐based algorithms in both time efficiency and accuracy. Specifically, CWC was about 31 times faster than LCC, which had been known as the fastest in the literature.
Furthermore, in a comparison that included feature selection algorithms that are not consistency‐based, CWC proved to be among the fastest and most accurate feature selection algorithms.
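The binary consistency measure and the idea of data set denoising can be illustrated with a short Python sketch. This is not the CWC algorithm, whose details the abstract does not give. The function names, the majority-vote rule for deciding which examples count as noise, and the greedy backward elimination are all assumptions chosen only to make the idea concrete.

```python
from collections import Counter, defaultdict
from itertools import product


def is_consistent(rows, labels, features):
    """Binary consistency measure: 1 if the chosen features determine the label, else 0."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        # Two examples that agree on the chosen features must share a label.
        if seen.setdefault(key, label) != label:
            return 0
    return 1


def denoise(rows, labels):
    """Assumed noise rule: among examples with identical feature values,
    keep only those carrying the majority label."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row)].append(i)
    keep = []
    for idxs in groups.values():
        majority = Counter(labels[i] for i in idxs).most_common(1)[0][0]
        keep.extend(i for i in idxs if labels[i] == majority)
    keep.sort()
    return [rows[i] for i in keep], [labels[i] for i in keep]


def select_features(rows, labels):
    """Denoise, then greedily drop features while the rest stay consistent."""
    rows, labels = denoise(rows, labels)
    selected = list(range(len(rows[0])))
    for f in list(selected):
        trial = [g for g in selected if g != f]
        if is_consistent(rows, labels, trial):
            selected = trial
    return selected


# Toy data: the label is XOR of features 0 and 1; feature 2 is irrelevant.
rows = [tuple(r) for r in product((0, 1), repeat=3)]
labels = [a ^ b for a, b, _ in rows]
# One repeated clean example and one noisy example with a flipped label.
rows += [(0, 0, 0), (0, 0, 0)]
labels += [0, 1]

print(is_consistent(rows, labels, [0, 1, 2]))  # 0: the noisy example breaks consistency
print(select_features(rows, labels))           # [0, 1]
```

In this toy data neither feature 0 nor feature 1 determines the label alone, yet together they do, which is the kind of interaction that consistency measures are meant to capture. Before denoising, the binary measure returns 0 for every feature set, so it gives no guidance; after the noisy example is removed, it can drive the search.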