When Costs Are Unequal and Unknown: A Subtree Grafting Approach for Unbalanced Data Classification

Article ID: iaor201112034
Volume: 42
Issue: 4
Start Page Number: 803
End Page Number: 829
Publication Date: Nov 2011
Journal: Decision Sciences
Authors: ,
Keywords: data mining
Abstract:

In binary classification, a decision tree learned from unbalanced data typically suffers from a high misclassification rate on the minority class. Assigning different misclassification costs can address this problem, though usually at the expense of accuracy on the majority class. This trade-off can be particularly hazardous if the costs cannot be specified precisely. When the costs are unknown or difficult to determine, decision makers may prefer a classifier with more balanced accuracy across both classes over a standard or cost-sensitive one. For tree learning, this research therefore proposes a new tree induction approach called subtree grafting (STG). Using a real bank data set and several other data sets, we test the STG method and find that it provides a successful compromise between standard and cost-sensitive trees.
