Article ID: | iaor201523862 |
Volume: | 30 |
Issue: | 7 |
Start Page Number: | 985 |
End Page Number: | 992 |
Publication Date: | Nov 2014 |
Journal: | Quality and Reliability Engineering International |
Authors: | Perner Petra |
Keywords: | data mining, artificial intelligence: expert systems |
Data mining methods are widely used across many disciplines to identify patterns, rules, or associations in large volumes of data. Methods with explanation capability, such as decision tree induction, are preferred in many domains. The aim of this paper is to discuss how to deal with the results of decision tree induction methods. The paper was prompted by the observation that domain experts are able to use the tools for decision tree induction but have great difficulty interpreting the results. When a domain expert has learnt two decision trees from the same domain but based on different data sets, as a result of further data collection, he is faced with the problem of how to interpret the differences between the trees. Comparing two decision trees is therefore an important issue, since the user needs such a comparison to understand what has changed. We have proposed to provide him with a measure of correspondence between the two trees that allows him to judge whether he can accept the changes. In this paper, we propose a suitable similarity measure. If the similarity value is low, the expert has evidence to start exploring the reason for the change. Often, he can find issues in the data acquisition that may have introduced noise and that may be fixable.
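
As a rough illustration of what a measure of correspondence between two trees can look like, the sketch below decomposes each tree into its root-to-leaf rules and reports the Jaccard overlap of the two rule sets. This is not the similarity measure proposed in the paper, which the abstract does not specify. The nested-dict tree layout, the function names `extract_rules` and `rule_set_similarity`, and the example trees are all assumptions made for illustration.

```python
def extract_rules(node, path=()):
    """Return the set of root-to-leaf rules of a tree stored as nested dicts.

    Hypothetical layout: an inner node is {"test": str, "yes": node, "no": node},
    a leaf is {"label": str}. A rule is (frozenset of (test, outcome), label).
    """
    if "label" in node:  # leaf: the conditions collected so far form one rule
        return {(frozenset(path), node["label"])}
    test = node["test"]
    rules = extract_rules(node["yes"], path + ((test, True),))
    rules |= extract_rules(node["no"], path + ((test, False),))
    return rules


def rule_set_similarity(tree_a, tree_b):
    """Jaccard similarity of the two rule sets: 1.0 for identical trees."""
    rules_a, rules_b = extract_rules(tree_a), extract_rules(tree_b)
    union = rules_a | rules_b
    return len(rules_a & rules_b) / len(union) if union else 1.0


# Two invented trees from the "same domain": the second one was learnt after
# further data collection and uses a different income threshold.
tree_old = {
    "test": "age<=30",
    "yes": {"label": "low"},
    "no": {"test": "income<=50", "yes": {"label": "low"}, "no": {"label": "high"}},
}
tree_new = {
    "test": "age<=30",
    "yes": {"label": "low"},
    "no": {"test": "income<=60", "yes": {"label": "low"}, "no": {"label": "high"}},
}

print(rule_set_similarity(tree_old, tree_old))  # identical trees
print(rule_set_similarity(tree_old, tree_new))  # one shared rule out of five
```

Exact rule matching is deliberately strict: a small shift in one threshold makes the affected rules count as entirely different. A practical measure would likely compare rules more gradually, for example by scoring shared substructures or nearby thresholds.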