Pattern Sampling in Distributed Databases

Abstract

Many applications rely on distributed databases. However, only a few discovery methods can extract patterns without centralizing the data. In fact, centralizing the data is often less expensive than communicating the extracted patterns from the different nodes. To circumvent this difficulty, this paper revisits the problem of pattern mining in distributed databases by leveraging pattern sampling. Specifically, we propose the algorithm DDSampling, which randomly draws a pattern from a distributed database with a probability proportional to its interest. We demonstrate the soundness of DDSampling and analyze its time complexity. Finally, experiments on benchmark datasets highlight its low communication cost and its robustness. We also illustrate its usefulness on real-world data from the Semantic Web by detecting outlier entities in DBpedia and Wikidata.
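To make the idea of drawing a pattern "with a probability proportional to its interest" concrete, here is a minimal sketch. It is not the paper's DDSampling algorithm, and it rests on several assumptions: interest is taken to be plain frequency, the database is horizontally partitioned (each node holds whole transactions), and a coordinator can learn one aggregate weight per node. The names `node_weight` and `sample_pattern` are hypothetical. The sketch uses the classic two-step scheme: pick a transaction with probability proportional to its number of non-empty subsets, then pick one of those subsets uniformly.

```python
import random


def node_weight(transactions):
    """Number of (transaction, non-empty sub-itemset) pairs held by one node.

    Assumption: this single integer is the only thing a node would need
    to send to the coordinator before sampling starts.
    """
    return sum(2 ** len(t) - 1 for t in transactions)


def sample_pattern(nodes, rng):
    """Draw one itemset with probability proportional to its global frequency.

    `nodes` is a list of nodes, each node being a list of transactions (sets).
    Assumes at least one non-empty transaction exists somewhere.
    """
    # Step 1: choose a node proportionally to its aggregate weight.
    weights = [node_weight(txns) for txns in nodes]
    node = rng.choices(nodes, weights=weights, k=1)[0]

    # Step 2: inside that node, choose a transaction proportionally
    # to its number of non-empty subsets.
    t_weights = [2 ** len(t) - 1 for t in node]
    transaction = rng.choices(node, weights=t_weights, k=1)[0]

    # Step 3: choose a non-empty subset of the transaction uniformly,
    # by rejection sampling over independent coin flips per item.
    items = sorted(transaction)
    while True:
        pattern = frozenset(i for i in items if rng.random() < 0.5)
        if pattern:
            return pattern


if __name__ == "__main__":
    # Toy data: two nodes, three transactions in total.
    toy_nodes = [[{"a", "b"}, {"a"}], [{"a", "c"}]]
    print(sample_pattern(toy_nodes, random.Random(0)))
```

Under these assumptions, an itemset X is returned with probability freq(X) / W, where W is the sum of all node weights: each transaction containing X is chosen with probability (2^|t| - 1) / W and then yields X with probability 1 / (2^|t| - 1). Only the per-node weights and the single sampled pattern cross the network, which is the kind of communication saving the abstract refers to.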

Cite

Diop, L., Diop, C. T., Giacometti, A., & Soulet, A. (2020). Pattern Sampling in Distributed Databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12245 LNCS, pp. 60–74). Springer. https://doi.org/10.1007/978-3-030-54832-2_7
