Against classification attacks: A decision tree pruning approach to privacy protection in data mining

Article ID: iaor200973331
Volume: 57
Issue: 6
Start Page Number: 1496
End Page Number: 1509
Publication Date: Nov 2009
Journal: Operations Research
Authors: ,
Abstract:

Data-mining techniques can be used not only to study the collective behavior of customers, but also to discover private information about individuals. In this study, we demonstrate that decision trees, a popular classification technique for data mining, can be used to effectively reveal individuals' confidential data, even when the identities of the individuals are not present in the data. We propose a novel approach for organizations to protect confidential data from such a classification attack. The key components of this approach include a set of entropy-based measures to evaluate disclosure risks of individual records, an optimal pruning algorithm to identify high-risk records, and a pair of data-swapping procedures to reduce the disclosure risks. The proposed method provides the best trade-off between data utility and privacy protection against classification attacks. It can be applied to data with both numeric and categorical attributes. An experimental study on six real-world data sets shows that the proposed method is very effective in protecting privacy while enabling legitimate data mining and analysis.
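
The abstract does not state the paper's actual risk measures, so the following is only a minimal sketch of the general idea behind an entropy-based disclosure check, not the authors' method. It assumes that records have already been grouped (for example, by the leaf of a decision tree they fall into) and that a group whose confidential values are nearly uniform, i.e. has low Shannon entropy, is easier for an attacker to predict. The names `entropy`, `flag_high_risk`, `leaves`, and the threshold value are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def flag_high_risk(groups, threshold):
    """Return the keys of groups whose confidential values have entropy
    below `threshold` (an assumed, illustrative risk criterion)."""
    return [key for key, vals in groups.items() if entropy(vals) < threshold]


# Hypothetical example: confidential attribute values grouped by tree leaf.
leaves = {
    "leaf_a": ["high", "high", "high", "high"],  # homogeneous: easy to infer
    "leaf_b": ["high", "low", "high", "low"],    # mixed: harder to infer
}
risky = flag_high_risk(leaves, threshold=0.5)
```

In this toy setting only the homogeneous group is flagged; a protection step such as data swapping would then target the records in that group.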
