Article ID: iaor20081043
Country: United States
Volume: 52
Issue: 5
Start Page Number: 658
End Page Number: 670
Publication Date: May 2006
Journal: Management Science
Authors: Muralidhar Krishnamurty, Sarathy Rathindra
Keywords: data models
This study discusses a new procedure for masking confidential numerical data – a procedure called data shuffling – in which the values of the confidential variables are ‘shuffled’ among observations. The shuffled data provide a high level of data utility and minimize the risk of disclosure. From a practical perspective, data shuffling overcomes reservations about using perturbed or modified confidential data because it retains all the desirable properties of perturbation methods and performs better than other masking techniques in both data utility and disclosure risk. In addition, data shuffling can be implemented using only rank-order data, and thus provides a nonparametric method for masking. We illustrate the applicability of data shuffling for small and large data sets.
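The abstract's central idea, that confidential values are reassigned among observations using only rank-order information, can be illustrated with a small sketch. This is not the authors' published procedure. It is a simplified illustration under stated assumptions: a single confidential variable `x`, a single non-confidential variable `s`, dependence captured by the correlation of rank-based normal scores, and hypothetical function names (`ranks`, `normal_scores`, `pearson`, `shuffle_confidential`) chosen here for readability.

```python
import random
from statistics import NormalDist


def ranks(values):
    """Return 0-based ranks; ties are broken by position (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def normal_scores(values):
    """Map each value to a standard-normal quantile based only on its rank."""
    n = len(values)
    nd = NormalDist()
    return [nd.inv_cdf((r + 0.5) / n) for r in ranks(values)]


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def shuffle_confidential(x, s, seed=0):
    """Illustrative rank-based shuffle of confidential x given non-confidential s.

    Assumes len(x) == len(s) >= 2. Simplified sketch, not the published method.
    """
    rng = random.Random(seed)
    z_x, z_s = normal_scores(x), normal_scores(s)
    rho = pearson(z_x, z_s)
    # Guard against tiny negative values caused by floating-point rounding.
    noise_sd = max(0.0, 1.0 - rho * rho) ** 0.5
    # Draw a perturbed score for each record, conditional on its s score.
    y_star = [rho * z + rng.gauss(0.0, noise_sd) for z in z_s]
    # Reverse mapping: the record whose perturbed score has rank r receives
    # the original confidential value of rank r.
    sorted_x = sorted(x)
    return [sorted_x[r] for r in ranks(y_star)]


if __name__ == "__main__":
    income = [52.0, 31.5, 78.2, 44.1, 95.0, 60.3]  # confidential (made-up data)
    age = [41, 25, 55, 33, 62, 47]                 # non-confidential (made-up data)
    print(shuffle_confidential(income, age, seed=1))
```

Because every released value is one of the original values, the marginal distribution of the confidential variable is preserved exactly, while the link between a value and the record it came from is broken. The noise term controls how closely the shuffled ranks follow the non-confidential variable.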