Research Article Open Access

Randomization of Statistical Queries of Type Median: A Simulation Approach

Jose Daniel Velazco¹, Mohammed Awad² and Ernst L. Leiss³
  • ¹ University of Houston-Clear Lake, United States
  • ² American University of Ras Al Khaimah, United Arab Emirates
  • ³ University of Houston, United States

Abstract

Access to data about individuals by researchers and third parties is becoming the norm, and the conclusions drawn from such data can be extremely beneficial. However, data owners must keep sensitive data fields secret and ensure that they are protected against inference attacks. Several techniques and query restrictions can prevent adversaries from inferring and identifying sensitive data related to specific individuals; one proposed technique is randomization. In this study, we demonstrate and analyze the implementation of randomization for statistical queries that use the selector function median. The randomization technique yields a possibly erroneous, yet usually reasonably accurate, response to every query. In addition, we explain the inference procedure and analyze potential modifications intended to counter the randomization technique, testing them against it. We show that, despite these modifications, randomization protects the data by adding uncertainty to the inference procedure, thus maintaining differential privacy. We present and explain the results of an extensive simulation that tests the various parameters of the randomization technique on randomly generated databases.
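
To make the idea of a randomized median query concrete, the following minimal sketch may help. The abstract does not specify the mechanism, so the sketch assumes one common form of randomization: the query set is padded with a few randomly chosen records from outside it before the median is selected. The function name `randomized_median` and the parameters `extra` and `rng` are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import random
import statistics


def randomized_median(values, query_indices, extra=2, rng=None):
    """Answer a median query over a padded query set (illustrative sketch).

    ASSUMPTION: randomization is modeled here as adding `extra` randomly
    chosen records from outside the query set; the article's actual
    scheme and parameters may differ.
    """
    rng = rng or random.Random()
    query_set = set(query_indices)
    # Candidate records that the user did not ask about.
    outside = [i for i in range(len(values)) if i not in query_set]
    # Pad with at most `extra` of them, chosen uniformly without replacement.
    padding = rng.sample(outside, min(extra, len(outside)))
    augmented = [values[i] for i in query_set] + [values[i] for i in padding]
    # median_low always returns an element of the data, which matches the
    # notion of median as a selector function.
    return statistics.median_low(augmented)


if __name__ == "__main__":
    database = list(range(100))       # toy database of distinct values
    answer = randomized_median(database, range(10, 21), extra=2)
    print(answer)                     # close to, but not always, the true median 15
```

Under this assumption the response is always a value from the database and stays near the true median, yet the user cannot tell which records actually contributed to it, which is the source of the uncertainty in the inference procedure.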

Journal of Computer Science
Volume 14 No. 1, 2018, 67-80

DOI: https://doi.org/10.3844/jcssp.2018.67.80

Submitted On: 27 September 2017 Published On: 15 January 2018

How to Cite: Velazco, J. D., Awad, M. & Leiss, E. L. (2018). Randomization of Statistical Queries of Type Median: A Simulation Approach. Journal of Computer Science, 14(1), 67-80. https://doi.org/10.3844/jcssp.2018.67.80

Keywords

  • Randomization
  • Median Queries
  • Statistical Database Security
  • Inference Attacks