Privacy-preserving genomic data publishing via differentially-private suffix tree

Abstract

Privacy-preserving data publishing is a mechanism for sharing data while ensuring that the privacy of individuals is preserved in the published data and that utility is maintained for data mining and analysis. Sharing genomic data is much needed to advance medical and health research. However, because genomic data is highly sensitive and serves as the ultimate identifier, publishing it while protecting the privacy of the individuals it describes is a major challenge. In this paper, we address this challenge by presenting an approach for privacy-preserving genomic data publishing via a differentially-private suffix tree. The proposed algorithm uses a top-down approach and the Laplace mechanism to divide the raw genomic data into disjoint partitions, and then normalizes the partitioning structure to ensure consistency and maintain utility. The output of our algorithm is a differentially-private suffix tree, a data structure most suitable for efficient search on genomic data. We experiment on real-life genomic data obtained from the Human Genome Privacy Challenge project and show that our approach is efficient, scalable, and achieves high utility with respect to genomic sequence matching count queries.
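The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: top-down partitioning into disjoint groups, Laplace noise on each group's count, and a normalization pass for consistency. Everything specific here is an assumption for illustration and not the authors' method: the function names (`publish`, `build_noisy_tree`, `normalize`, `query`), the even split of the budget `epsilon` across `height` levels, the pruning `threshold`, and the rescaling rule. To keep the sensitivity of each count at 1, the sketch partitions sequences by prefix rather than indexing every suffix, so it answers prefix count queries only; a true suffix tree would need a different sensitivity analysis.

```python
import random

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def build_noisy_tree(sequences, epsilon, height, threshold, rng, depth=0):
    """Partition sequences top-down by the symbol at position `depth`.

    Returns {symbol: {"count": noisy_count, "children": {...}}}.
    """
    if depth == height:
        return {}
    # Assumption: each level spends epsilon / height. Sibling partitions are
    # disjoint, so one sequence changes one count per level by at most 1.
    scale = height / epsilon
    children = {}
    for symbol in ALPHABET:  # every symbol is tried, including empty partitions
        part = [s for s in sequences if len(s) > depth and s[depth] == symbol]
        noisy = len(part) + laplace_noise(scale, rng)
        if noisy >= threshold:  # prune partitions whose noisy count is too small
            children[symbol] = {
                "count": noisy,
                "children": build_noisy_tree(
                    part, epsilon, height, threshold, rng, depth + 1
                ),
            }
    return children


def normalize(children, parent_count):
    # Post-processing only: rescale children whose counts sum above the parent.
    total = sum(node["count"] for node in children.values())
    factor = parent_count / total if total > parent_count else 1.0
    for node in children.values():
        node["count"] *= factor
        normalize(node["children"], node["count"])


def publish(sequences, epsilon, height, threshold, seed=None):
    rng = random.Random(seed)
    tree = build_noisy_tree(sequences, epsilon, height, threshold, rng)
    normalize(tree, float("inf"))  # the root has no parent count to respect
    return tree


def query(tree, pattern):
    """Noisy number of sequences that start with `pattern` (0.0 if pruned)."""
    children, count = tree, 0.0
    for symbol in pattern:
        node = children.get(symbol)
        if node is None:
            return 0.0
        count, children = node["count"], node["children"]
    return count


if __name__ == "__main__":
    data = ["ACGT", "ACGA", "ACCT", "TTGA"] * 25  # toy data, not real genomes
    tree = publish(data, epsilon=1.0, height=3, threshold=5.0, seed=42)
    print(query(tree, "AC"))
```

Because normalization only reads the already-noised counts, it does not consume additional privacy budget. The sampler uses ordinary floating-point arithmetic and is meant for illustration, not as a hardened noise generator.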

Cite

Khatri, T., Dagher, G. G., & Hou, Y. (2019). Privacy-preserving genomic data publishing via differentially-private suffix tree. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (Vol. 304 LNICST, pp. 569–584). Springer. https://doi.org/10.1007/978-3-030-37228-6_28
