An increasing number of applications, in domains ranging from biomedicine to business and pervasive computing, feature data represented as a long sequence of symbols (a string). Sharing these data, however, may lead to the disclosure of sensitive patterns, which are represented as substrings and model confidential information. Such patterns may model, for example, confidential medical knowledge, business secrets, or signatures of activity patterns that may put the privacy of smartphone users at risk. In this paper, we study the novel problem of concealing a given set of sensitive patterns from a string. Our approach is based on injecting a minimal level of uncertainty into the string, by replacing selected symbols in the string with a symbol “∗” that is interpreted as any symbol from the set of possible symbols that may appear in the string. To realize our approach, we propose an algorithm that efficiently detects occurrences of the sensitive patterns in the string and then sanitizes these sensitive patterns. We also present a preliminary set of experiments to demonstrate the effectiveness and efficiency of our algorithm.
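The detect-then-sanitize idea described above can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes, for illustration only, that an occurrence counts as concealed once at least one of its symbols has been replaced by "*", and it uses a simple greedy interval-stabbing strategy to keep the number of replacements small. The function names `find_occurrences` and `sanitize` are hypothetical.

```python
def find_occurrences(text, patterns):
    """Return (start, end) spans, end exclusive, of every pattern occurrence."""
    spans = []
    for pattern in patterns:
        if not pattern:
            continue  # skip empty patterns, which would match everywhere
        start = text.find(pattern)
        while start != -1:
            spans.append((start, start + len(pattern)))
            # Advance by one position so overlapping occurrences are found too.
            start = text.find(pattern, start + 1)
    return spans


def sanitize(text, patterns, mask="*"):
    """Replace one symbol in every occurrence of a sensitive pattern.

    Assumption (not from the paper): an occurrence is considered concealed
    as soon as one of its positions holds the mask symbol.
    """
    # Process occurrences by increasing end position (greedy interval stabbing).
    spans = sorted(find_occurrences(text, patterns), key=lambda span: span[1])
    chars = list(text)
    last_masked = -1
    for start, end in spans:
        if last_masked < start:  # no masked position falls in this occurrence
            last_masked = end - 1  # mask its last symbol to hit later overlaps
            chars[last_masked] = mask
    return "".join(chars)


if __name__ == "__main__":
    print(sanitize("banana", ["ana", "nan"]))  # prints "ban*na"
```

Masking the last symbol of the earliest-ending unmasked occurrence lets a single replacement cover as many overlapping occurrences as possible, which is why one "*" suffices for the three overlapping occurrences in the example.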
Ajala, O., Alamro, H., Iliopoulos, C., & Loukides, G. (2018). Towards string sanitization. In IFIP Advances in Information and Communication Technology (Vol. 520, pp. 200–210). Springer New York LLC. https://doi.org/10.1007/978-3-319-92016-0_19