Semantic search log k-anonymization with generalized k-cores of query concept graph

6Citations
Citations of this article
18Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Search log k-anonymization is based on the elimination of infrequent queries under exact (or nearly exact) matching conditions, which usually results in a big data loss and impaired utility. We present a more flexible, semantic approach to k-anonymity that consists of three steps: query concept mining, automatic query expansion, and affinity assessment of expanded queries. Based on the observation that many infrequent queries can be seen as refinements of a more general frequent query, we first model query concepts as probabilistically weighted n-grams and extract them from the search log data. Then, after expanding the original log queries with their weighted concepts, we find all the k-affine expanded queries under a given affinity threshold Θ, modeled as a generalized k-core of the graph of Θ-affine queries. Experimenting with the AOL data set, we show that this approach achieves levels of privacy comparable to those of plain k-anonymity while at the same time reducing the data losses to a great extent. © 2013 Springer-Verlag.

Cite

CITATION STYLE

APA

Carpineto, C., & Romano, G. (2013). Semantic search log k-anonymization with generalized k-cores of query concept graph. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7814 LNCS, pp. 110–121). https://doi.org/10.1007/978-3-642-36973-5_10

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free