Minimum-distance controlled perturbation methods for large-scale tabular data protection

Article ID: iaor2007539
Country: Netherlands
Volume: 171
Issue: 1
Start Page Number: 39
End Page Number: 52
Publication Date: May 2006
Journal: European Journal of Operational Research
Authors:
Keywords: programming: linear, programming: quadratic, statistics: general
Abstract:

National Statistical Agencies routinely release large amounts of tabular information. Prior to dissemination, tabular data need to be processed to avoid the disclosure of individual confidential information. One widely used class of methods is based on modifying the table cell values. However, previous approaches were not able to preserve the values of the marginal cells and the additivity relations for a general table of any dimension, size and structure. This void was recently filled by controlled tabular adjustment and one of its variants, the quadratic minimum-distance controlled perturbation method. Although independently developed, both approaches rely on the same strategy: given a set of tables to be protected, they find the values closest to the original cells that make the released information safe. Controlled tabular adjustment uses the L1 distance; the quadratic minimum-distance variant uses L2. This work presents both approaches within a unified framework, and includes a new variant based on L∞. Among other benefits, the unified framework permits a simple comparison of the three distances, and a single general result about their disclosure risk. The three distances are evaluated with the only standard library for tabular data protection currently available. Some of the complex instances were contributed by National Statistical Agencies and are therefore good representatives of their real needs. Unlike alternative methods, the methods based on the three distances solved all the instances, requiring only a few seconds for each of them on a personal computer using a general-purpose solver. The results show that this class of methods is an effective and promising tool for the protection of large volumes of tabular data. All the linear and quadratic problems solved in the paper are delivered to the optimization community in MPS format.
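
As a rough illustration of the idea described in the abstract, the following sketch checks that a perturbed table still satisfies its additivity relations and measures its L1, L2 and L∞ distances to the original. It does not solve the optimization problems from the paper. The 2x2 table, the choice of sensitive cell, the protection level of 3, and the helper names `distances` and `is_additive` are all assumptions made for this example.

```python
from math import sqrt

def distances(original, perturbed):
    """Return the (L1, L2, Linf) distances between two equal-length cell lists."""
    diffs = [abs(p - o) for o, p in zip(original, perturbed)]
    return sum(diffs), sqrt(sum(d * d for d in diffs)), max(diffs)

def is_additive(table):
    """Check that the last column holds row totals and the last row holds column totals."""
    *body, totals = table
    rows_ok = all(sum(row[:-1]) == row[-1] for row in table)
    cols_ok = all(sum(row[j] for row in body) == totals[j]
                  for j in range(len(totals)))
    return rows_ok and cols_ok

def flatten(table):
    """List all cells of a table row by row."""
    return [cell for row in table for cell in row]

# Hypothetical 2x2 table with a totals column and a totals row.
original = [[10, 20, 30],
            [15, 25, 40],
            [25, 45, 70]]

# Assumption: cell (0, 0) is sensitive and must move by at least 3.
# The other interior cells compensate so that every marginal is unchanged.
perturbed = [[13, 17, 30],
             [12, 28, 40],
             [25, 45, 70]]

l1, l2, linf = distances(flatten(original), flatten(perturbed))
print(is_additive(perturbed), l1, l2, linf)
```

In a 2x2 table with fixed marginals the interior has a single degree of freedom, so once the sensitive cell is moved up by 3 the remaining cells are determined. Larger tables leave real freedom in how the deviation is spread, which is where the choice of distance starts to matter.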
