Article ID: | iaor20073383 |
Country: | Netherlands |
Volume: | 150 |
Issue: | 1 |
Start Page Number: | 79 |
End Page Number: | 92 |
Publication Date: | Mar 2007 |
Journal: | Annals of Operations Research |
Authors: | Bruni Renato |
Keywords: | classification |
The paper is concerned with the problem of binary classification of data records, given an already classified training set of records. Among the various approaches to the problem, the methodology of the logical analysis of data (LAD) is considered. This approach is based on discrete mathematics, with special emphasis on Boolean functions. Enhancements to the standard LAD procedure, based on probability considerations, are presented. In particular, the problem of selecting the optimal support set is formulated as a weighted set covering problem. Testable statistical hypotheses are used. The accuracy of the modified LAD procedure is compared with that of the standard LAD procedure on data sets from the UCI repository. Encouraging results are obtained and discussed.
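As a rough illustration of what a weighted set covering view of support set selection can look like, the sketch below treats each (positive record, negative record) pair as an element to be covered and each binary attribute as a set covering the pairs on which it differs. It is not the paper's procedure: the abstract specifies neither the weights nor the solution method, so the function names, the toy records, the weight values, and the use of the classical greedy heuristic are all assumptions made for this example.

```python
from itertools import product


def greedy_weighted_set_cover(universe, subsets, weights):
    """Greedy heuristic: repeatedly pick the subset with the lowest
    weight per newly covered element. Returns the chosen subset keys."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for name, items in subsets.items():
            gain = len(uncovered & items)
            if gain == 0:
                continue  # this subset covers nothing new
            ratio = weights[name] / gain
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("universe cannot be covered")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


def select_support_set(positives, negatives, weights):
    """Hypothetical helper: choose attribute indices that separate every
    positive record from every negative record."""
    pairs = list(product(range(len(positives)), range(len(negatives))))
    # Attribute j covers the pair (p, q) when the two records differ on j.
    subsets = {
        j: {(p, q) for p, q in pairs if positives[p][j] != negatives[q][j]}
        for j in range(len(weights))
    }
    return sorted(greedy_weighted_set_cover(pairs, subsets, weights))


# Toy binary records and attribute weights (illustrative values only).
positives = [(1, 0, 1), (1, 1, 0)]
negatives = [(0, 0, 1), (0, 1, 1)]
weights = [1.0, 2.0, 2.0]
support = select_support_set(positives, negatives, weights)
print(support)
```

In this toy instance the first attribute alone distinguishes every positive record from every negative one, so it is the only attribute selected. The greedy heuristic does not guarantee an optimal cover in general; an exact integer programming solver could be substituted for small instances.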