Article ID: | iaor2003436 |
Country: | China |
Volume: | 16 |
Issue: | 4 |
Start Page Number: | 289 |
End Page Number: | 295 |
Publication Date: | Aug 2001 |
Journal: | Journal of Systems Engineering and Electronics |
Authors: | Zhao Weidong, Li Qihao |
Keywords: | sets |
Decision trees are a useful tool for data mining, but designing an optimal decision tree has been proved to be NP-hard. This paper analyzes the disadvantages of earlier algorithms such as ID3 and C4.5, and then discusses several important problems of decision tree design in detail from the viewpoint of optimization. To avoid excessive branching at each non-leaf node, discernibility from rough set theory is introduced for partitioning the values of nominal attributes, and a genetic algorithm is used to search for better solutions. Continuous attributes must also be discretized. Based on this analysis, a new decision tree induction algorithm is proposed. The experimental results show that the proposed algorithm performs better.
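
As an illustration of how a rough-set discernibility measure might score a partition of a nominal attribute's values, the following minimal sketch counts the pairs of objects that have different class labels and are placed in different value groups. The abstract does not give the paper's exact measure, so the function name discerned_pairs, the grouping dictionary, and the toy data are all assumptions made for this example, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def discerned_pairs(values, labels, grouping):
    """Count object pairs that differ in class label AND fall into
    different groups of the nominal attribute's values.

    values   -- attribute value of each object
    labels   -- class label of each object
    grouping -- hypothetical mapping: attribute value -> group id
    """
    # Number of objects for each (group, label) combination.
    counts = Counter((grouping[v], y) for v, y in zip(values, labels))
    total = 0
    for (g1, y1), (g2, y2) in combinations(counts, 2):
        if g1 != g2 and y1 != y2:
            total += counts[(g1, y1)] * counts[(g2, y2)]
    return total


# Toy data (invented for illustration only).
values = ["red", "red", "green", "blue", "blue", "green"]
labels = ["yes", "yes", "no", "no", "no", "yes"]

# One branch per value versus a two-branch grouping of the same values.
full_split = {"red": 0, "green": 1, "blue": 2}
binary_split = {"red": 0, "green": 1, "blue": 1}

print(discerned_pairs(values, labels, full_split))    # 8
print(discerned_pairs(values, labels, binary_split))  # 6
```

On this toy data, 9 pairs of objects have different labels. The three-branch split discerns 8 of them and the two-branch grouping discerns 6, which shows the trade-off between the number of branches and discriminating power that a search procedure such as a genetic algorithm could explore.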