Article ID: | iaor201112034 |
Volume: | 42 |
Issue: | 4 |
Start Page Number: | 803 |
End Page Number: | 829 |
Publication Date: | Nov 2011 |
Journal: | Decision Sciences |
Authors: | Zhu Dan, Lee Jong-Seok |
Keywords: | data mining |
In binary classification, a decision tree learned from unbalanced data typically suffers from a high misclassification rate for the minority class. Assigning different misclassification costs to the two classes can address this problem, though usually at the expense of accuracy for the majority class. This trade-off can be particularly hazardous if the costs cannot be specified precisely. When the costs are unknown or difficult to determine, decision makers may prefer a classifier with more balanced accuracy across both classes over either a standard tree or a cost-sensitive one. This research therefore proposes a new tree induction approach called subtree grafting (STG). Tests on a real bank data set and several other data sets show that STG provides a successful compromise between standard and cost-sensitive trees.
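The trade-off described in the abstract can be illustrated with a minimal sketch. It does not implement STG, whose details are not given in this record. It only shows how the label of a tree leaf changes under the usual expected-cost rule, and how that shifts accuracy between the two classes. The function names, the leaf counts, and the 10:1 cost ratio are all hypothetical and chosen for illustration only.

```python
def cost_sensitive_label(p_minority, cost_fn, cost_fp):
    """Label a leaf as minority (1) when that choice has the lower expected cost."""
    # Expected cost of predicting majority: p * cost_fn (missed minority cases).
    # Expected cost of predicting minority: (1 - p) * cost_fp (false alarms).
    return 1 if p_minority * cost_fn > (1 - p_minority) * cost_fp else 0


def per_class_accuracy(leaves, cost_fn, cost_fp):
    """Return (minority accuracy, majority accuracy) for a labeled set of leaves.

    leaves is a list of (n_minority, n_majority) training counts per leaf.
    """
    total_min = sum(n_min for n_min, _ in leaves)
    total_maj = sum(n_maj for _, n_maj in leaves)
    hit_min = hit_maj = 0
    for n_min, n_maj in leaves:
        p = n_min / (n_min + n_maj)
        if cost_sensitive_label(p, cost_fn, cost_fp) == 1:
            hit_min += n_min  # minority cases in this leaf are classified correctly
        else:
            hit_maj += n_maj  # majority cases in this leaf are classified correctly
    return hit_min / total_min, hit_maj / total_maj


# Made-up leaf counts (minority, majority); not taken from the paper.
leaves = [(30, 10), (15, 85), (5, 405)]

# Equal costs correspond to a standard tree's majority-vote labeling.
standard = per_class_accuracy(leaves, cost_fn=1, cost_fp=1)
# Assumed 10:1 cost ratio penalizing missed minority cases.
cost_sensitive = per_class_accuracy(leaves, cost_fn=10, cost_fp=1)

print("standard       (minority, majority):", standard)
print("cost-sensitive (minority, majority):", cost_sensitive)
```

With these invented counts, raising the minority cost relabels the middle leaf, so minority accuracy rises while majority accuracy falls. A classifier aiming at balanced accuracy, as the abstract describes, would sit somewhere between these two outcomes.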