Article ID: | iaor200952626 |
Country: | United States |
Volume: | 20 |
Issue: | 3 |
Start Page Number: | 423 |
End Page Number: | 437 |
Publication Date: | Jun 2008 |
Journal: | INFORMS Journal on Computing |
Authors: | Padman Rema, Bai Xue, Ramsey Joseph, Spirtes Peter |
Keywords: | heuristics: tabu search |
Data sets with many discrete variables and relatively few cases arise in health care, e-commerce, information security, text mining, and many other domains. Learning effective and efficient prediction models from such data sets is a challenging task. In this paper, we propose a tabu search-enhanced Markov blanket (TS/MB) algorithm to learn a graphical Markov blanket model for classification of high-dimensional data sets. The TS/MB algorithm makes use of Markov blanket neighborhoods: restricted neighborhoods in a general Bayesian network based on the Markov condition. Computational results from real-world data sets drawn from several domains indicate that the TS/MB algorithm, when used as a feature selection method, is able to find a parsimonious model with substantially fewer predictor variables than are present in the full data set. The algorithm also provides good prediction performance when used as a graphical classifier compared with several machine-learning methods.
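
To make the term "Markov blanket" in the abstract concrete, the following minimal sketch computes the blanket of a target node in a known directed acyclic graph, using the standard definition: the node's parents, its children, and the other parents of those children. It is not the TS/MB algorithm, which learns the structure from data; the function name `markov_blanket`, the child-to-parents dictionary format, and the toy network are all hypothetical choices made for illustration.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps each node to the set of its direct parents
    (an assumed representation, chosen only for this sketch).
    """
    # Start with the direct parents of the target.
    blanket = set(parents.get(target, set()))
    for node, node_parents in parents.items():
        if target in node_parents:
            # `node` is a child of the target.
            blanket.add(node)
            # The child's other parents ("spouses") also belong to the blanket.
            blanket |= node_parents
    # The target itself is never part of its own blanket.
    blanket.discard(target)
    return blanket


# Hypothetical toy network: "class" is the variable to be predicted.
toy_dag = {
    "age": set(),
    "exposure": set(),
    "class": {"age"},
    "symptom": {"class", "exposure"},
    "test": {"symptom"},
}

mb = markov_blanket(toy_dag, "class")
print(sorted(mb))  # prints ['age', 'exposure', 'symptom']
```

In this toy network, `test` lies outside the blanket of `class`: given the blanket variables, it carries no further information about `class` under the Markov condition. This is why a learned blanket can serve as a feature subset with fewer predictors than the full data set.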