Article ID: | iaor20101502 |
Volume: | 174 |
Issue: | 1 |
Start Page Number: | 147 |
End Page Number: | 168 |
Publication Date: | Feb 2010 |
Journal: | Annals of Operations Research |
Authors: | Lee Eva K, Brooks J Paul |
Keywords: | classification |
Classification is concerned with the development of rules for the allocation of observations to groups, and is a fundamental problem in machine learning. Much of the previous work on classification models investigates two-group discrimination. Multi-category classification is less often considered because generalizations of two-group models tend to produce misclassification rates that are higher than desirable. Indeed, producing ‘good’ two-group classification rules is a challenging task for some applications, and producing good multi-category rules is generally more difficult. Additionally, even when the ‘optimal’ classification rule is known, inter-group misclassification rates may be higher than tolerable for a given classification model. We investigate properties of a mixed-integer programming based multi-category classification model that allows for the pre-specification of limits on inter-group misclassification rates. The limits are satisfied through the use of a reserved judgment region, an artificial category into which observations are placed when their attributes do not sufficiently indicate membership in any particular group. The method is shown to be a consistent estimator of a classification rule with misclassification limits, and performance on simulated and real-world data is demonstrated.
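The reserved judgment idea can be illustrated with a minimal sketch. It is not the mixed-integer programming model of the paper: it assumes each observation already has a numeric score per group, and it uses a single hypothetical cutoff (`threshold`) to decide when no group is indicated strongly enough. The label `RESERVED`, the function names, and the scoring scheme are all assumptions made for illustration only.

```python
RESERVED = 0  # assumed label for the reserved-judgment category


def classify_with_reserve(scores, threshold):
    """Assign to the best-scoring group, or RESERVED if no score exceeds threshold.

    scores: dict mapping a group label (1, 2, ...) to a numeric score.
    threshold: hypothetical cutoff; not a parameter of the published model.
    """
    best_group = max(scores, key=scores.get)
    if scores[best_group] <= threshold:
        return RESERVED
    return best_group


def intergroup_misclassification_rates(true_labels, predicted):
    """Return {(g, h): rate} for group-g observations assigned to another group h.

    Observations placed in RESERVED are not counted as misclassified.
    """
    counts = {}
    totals = {}
    for g, h in zip(true_labels, predicted):
        totals[g] = totals.get(g, 0) + 1
        if h != g and h != RESERVED:
            counts[(g, h)] = counts.get((g, h), 0) + 1
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}
```

In this sketch, raising `threshold` moves more borderline observations into the reserved category, which can lower the inter-group misclassification rates at the cost of leaving more observations unclassified. The paper's model instead chooses the rule so that pre-specified limits on those rates are met.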