Article ID: iaor20132415
Volume: 59
Issue: 4
Start Page Number: 796
End Page Number: 812
Publication Date: Apr 2013
Journal: Management Science
Authors: Sarkar Sumit, Li Xiao-Bai
Keywords: risk
The extensive use of information technologies by organizations to collect and share personal data has raised strong privacy concerns. To respond to the public's demand for data privacy, a class of clustering‐based data masking techniques is increasingly being used for privacy‐preserving data sharing and analytics. Although they address reidentification risks, traditional clustering‐based approaches for masking numeric attributes typically do not consider the disclosure risk of categorical confidential attributes. We propose a new approach to deal with this problem. The proposed method clusters data such that the data points within a group are similar in the nonconfidential attribute values, whereas the confidential attribute values within a group are