Feature subset selection for logistic regression via mixed integer optimization

Article ID: iaor20162339
Volume: 64
Issue: 3
Start Page Number: 865
End Page Number: 880
Publication Date: Jul 2016
Journal: Computational Optimization and Applications
Authors: , , ,
Keywords: statistics: regression, information, programming: integer, heuristics
Abstract:

This paper concerns a method of selecting a subset of features for a logistic regression model. Information criteria, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), are employed as goodness‐of‐fit measures. The purpose of our work is to establish a computational framework for selecting a subset of features with an optimality guarantee. To this end, we devise mixed integer optimization formulations for feature subset selection in logistic regression. Specifically, by making a piecewise linear approximation of the logistic loss function, we pose the problem as a mixed integer linear optimization problem, which can be solved with standard mixed integer optimization software. The computational results demonstrate that when there were fewer than 40 candidate features, our method provided a feature subset sufficiently close to an optimal one in a reasonable amount of time. Furthermore, even with more candidate features, our method often found a better subset of features than the stepwise methods did in terms of information criteria.
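
As an illustration of the piecewise linear approximation mentioned in the abstract, the following is a minimal sketch and not the authors' formulation. It assumes the logistic loss is written as f(v) = log(1 + exp(-v)) and approximates it from below by the maximum of a few tangent lines, which is valid because f is convex. The tangent points, the function names, and the parameter count passed to `aic` are hypothetical choices made for this example only.

```python
import math


def logistic_loss(v: float) -> float:
    """Logistic loss f(v) = log(1 + exp(-v)), evaluated in a numerically stable way."""
    if v >= 0:
        return math.log1p(math.exp(-v))
    return -v + math.log1p(math.exp(v))


def tangent(v_k: float) -> tuple[float, float]:
    """Return (slope, intercept) of the tangent line to f at the point v_k."""
    slope = -1.0 / (1.0 + math.exp(v_k))  # f'(v) = -1 / (1 + exp(v))
    intercept = logistic_loss(v_k) - slope * v_k
    return slope, intercept


def pwl_lower_bound(v: float, points: list[float]) -> float:
    """Piecewise linear under-estimate of f(v): the maximum over tangent lines.

    In a mixed integer linear model, the same idea would typically appear as
    one linear constraint t >= slope_k * v + intercept_k per tangent point,
    with t minimized (an assumption about how such a model is usually built).
    """
    return max(s * v + b for s, b in (tangent(p) for p in points))


def aic(neg_log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2 * (negative log-likelihood) + 2 * n_params."""
    return 2.0 * neg_log_likelihood + 2.0 * n_params


if __name__ == "__main__":
    points = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical tangent points
    for v in (-3.0, -0.5, 0.3, 2.7):
        exact = logistic_loss(v)
        approx = pwl_lower_bound(v, points)
        print(f"v={v:+.1f}  exact={exact:.4f}  approx={approx:.4f}")
```

Adding more tangent points tightens the approximation at the cost of more linear constraints, so the number of points is a trade-off between accuracy and solve time.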
