Feature minimization within decision trees

Article ID: iaor19991972
Country: Netherlands
Volume: 10
Issue: 2
Start Page Number: 111
End Page Number: 126
Publication Date: May 1998
Journal: Computational Optimization and Applications
Authors:
Keywords: classification
Abstract:

Decision trees for classification can be constructed using mathematical programming. Within decision tree algorithms, the feature minimization problem is to construct accurate decisions using as few features or attributes within each decision as possible. Feature minimization is an important aspect of data mining, since it helps identify which attributes are important and helps produce accurate and interpretable decision trees. In feature minimization with bounded accuracy, we minimize the number of features subject to a given misclassification error tolerance. This problem can be formulated as a parametric bilinear program and is shown to be NP-complete. A parametric Frank-Wolfe method is used to solve the bilinear subproblems. The resulting minimization algorithm produces more compact, accurate, and interpretable trees. The procedure can be applied to many different error functions; formulations and results for two error functions are given. One method, FMRLP-P, dramatically reduced the number of features of one dataset from 147 to 2 while maintaining 83.6% testing accuracy. Computational results compare favorably with the standard univariate decision tree method, C4.5, as well as with linear programming methods of tree construction.
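To illustrate the general idea, the sketch below minimizes a smoothed feature count over a linear-decision polyhedron by successive linearization, which is the Frank-Wolfe scheme applied to a concave objective (each subproblem is a linear program). This is an illustrative reconstruction, not the authors' FMRLP-P implementation: the smoothed count sum_j (1 - exp(-alpha * v_j)), the slack budget `tol`, and the toy data are all assumptions made for the example.

```python
# Feature minimization for a single linear decision, sketched as
# successive linearization (Frank-Wolfe for a concave objective over a
# polyhedron). Assumed model: find w, gamma with
#   y_i (x_i . w - gamma) >= 1 - xi_i,  sum_i xi_i <= tol,
# minimizing the smoothed feature count sum_j (1 - exp(-alpha * v_j)),
# where v_j >= |w_j|. Each iteration linearizes the concave objective at
# the current v and solves the resulting LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, d = 60, 10                       # 10 features; only feature 0 is informative
X = rng.normal(size=(n, d))
X[:, 0] += np.sign(X[:, 0]) * 0.5   # give feature 0 a clear margin
y = np.where(X[:, 0] > 0, 1.0, -1.0)

alpha, tol = 5.0, 1.0               # smoothing parameter, total slack budget
# Variable layout: [w (d), v (d), gamma (1), xi (n)]
nv = 2 * d + 1 + n

# y_i (x_i . w - gamma) >= 1 - xi_i   =>   -y_i x_i . w + y_i gamma - xi_i <= -1
A1 = np.zeros((n, nv))
A1[:, :d] = -y[:, None] * X
A1[:, 2 * d] = y
A1[:, 2 * d + 1:] = -np.eye(n)
b1 = -np.ones(n)
# w_j <= v_j  and  -w_j <= v_j   (so v_j >= |w_j|)
A2 = np.hstack([np.eye(d), -np.eye(d), np.zeros((d, 1 + n))])
A3 = np.hstack([-np.eye(d), -np.eye(d), np.zeros((d, 1 + n))])
# total slack <= tol
A4 = np.zeros((1, nv))
A4[0, 2 * d + 1:] = 1.0
A_ub = np.vstack([A1, A2, A3, A4])
b_ub = np.concatenate([b1, np.zeros(2 * d), [tol]])
bounds = [(None, None)] * d + [(0, None)] * d + [(None, None)] + [(0, None)] * n

v = np.ones(d)                      # start with every feature "on"
for _ in range(20):
    c = np.zeros(nv)
    c[d:2 * d] = alpha * np.exp(-alpha * v)   # gradient of the smoothed count
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    v_new = res.x[d:2 * d]
    if np.allclose(v_new, v, atol=1e-8):      # vertex repeats: local optimum
        break
    v = v_new

used = np.flatnonzero(v > 1e-6)
print("features kept:", used)
```

Because the objective is concave, each LP step moves to a polyhedron vertex and the iteration terminates finitely at a vertex that is a local solution; the exponential smoothing drives the weights of unneeded features to exactly zero, which is the compactness effect the abstract describes.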
