Article ID: | iaor20014241 |
Country: | United Kingdom |
Volume: | 52 |
Issue: | 3 |
Start Page Number: | 328 |
End Page Number: | 339 |
Publication Date: | Mar 2001 |
Journal: | Journal of the Operational Research Society |
Authors: | Glen J.J. |
Keywords: | programming: integer |
Classification models can be developed by statistical or mathematical programming discriminant analysis techniques. Variable selection extensions of these techniques allow the development of classification models with a limited number of variables. Although stepwise statistical variable selection methods are widely used, the performance of the resultant classification models may not be optimal because of the stepwise selection protocol and the nature of the group separation criterion. A mixed integer programming approach for selecting variables for maximum classification accuracy is developed in this paper and the performance of this approach, measured by the leave-one-out hit rate, is compared with the published results from a statistical approach in which all possible variable subsets were considered. Although this mixed integer programming approach can only be applied to problems with a relatively small number of observations, it may be of great value where classification decisions must be based on a limited number of observations.
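The leave-one-out hit rate and the idea of searching variable subsets under a size limit can be illustrated with a small sketch. It is not the paper's mixed integer programming model: it swaps in plain enumeration of all subsets and a nearest-centroid classification rule, both chosen here only for brevity. The function names (`loo_hit_rate`, `best_subset`), the `max_vars` limit and the toy data are hypothetical and do not come from the paper.

```python
from itertools import combinations


def centroid(rows):
    """Mean of each column over a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_hit_rate(X, y, cols):
    """Leave-one-out hit rate of a nearest-centroid rule using only `cols`.

    Each observation is held out in turn, group centroids are computed from
    the remaining observations, and the held-out observation is classified.
    """
    hits = 0
    for i in range(len(X)):
        groups = {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j != i:
                groups.setdefault(label, []).append([row[c] for c in cols])
        point = [X[i][c] for c in cols]
        predicted = min(groups, key=lambda g: sq_dist(point, centroid(groups[g])))
        hits += predicted == y[i]
    return hits / len(X)


def best_subset(X, y, max_vars):
    """Try every subset of at most `max_vars` columns.

    Returns (hit rate, columns) for the first subset reaching the highest
    leave-one-out hit rate, so smaller subsets win ties.
    """
    best = (-1.0, ())
    for k in range(1, max_vars + 1):
        for cols in combinations(range(len(X[0])), k):
            rate = loo_hit_rate(X, y, cols)
            if rate > best[0]:
                best = (rate, cols)
    return best


# Made-up toy data: column 0 separates the two groups, columns 1 and 2 are noise.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 1.0, 9.0],
    [0.8, 9.0, 4.0],
    [3.0, 5.5, 3.0],
    [3.2, 0.5, 8.0],
    [2.8, 8.5, 5.0],
]
y = [0, 0, 0, 1, 1, 1]

print(best_subset(X, y, max_vars=2))
```

Enumeration grows combinatorially with the number of candidate variables, and each subset requires refitting the rule once per observation, so a sketch like this is only practical for small data sets.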