Hybrid entity clustering using crowds and data

Abstract

Query result clustering has attracted considerable attention as a means of providing users with a concise overview of results. However, little research effort has been devoted to organizing query results for entities that refer to real-world concepts, e.g., people, products, and locations. Entity-level result clustering is more challenging because diverse notions of similarity between entities must be supported across heterogeneous domains, e.g., image resolution is an important feature for cameras, but not for fruits. To address this challenge, we propose a hybrid relationship clustering algorithm, called Hydra, that uses co-occurrence and numeric features. Hydra captures diverse user perceptions from co-occurrence and disambiguates different senses using feature-based similarity. In addition, we extend Hydra into Hydra_gData, which draws on additional sources, i.e., entity types and crowdsourcing. Experimental results show that the proposed algorithms are effective and efficient on both real-life and synthetic datasets. © 2013 Springer-Verlag Berlin Heidelberg.
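
The abstract does not give the details of Hydra, so the following is only a minimal sketch of the general idea of combining a co-occurrence signal with feature-based similarity. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, the Jaccard measure over hypothetical query sessions, the distance-based feature similarity, the two thresholds, and the toy data. Two entities are linked only when they co-occur often enough and their numeric features are close enough; clusters are the connected components of those links.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_similarity(a, b, sessions):
    """Jaccard overlap of the sets of sessions that mention a and b."""
    sa = {i for i, s in enumerate(sessions) if a in s}
    sb = {i for i, s in enumerate(sessions) if b in s}
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def feature_similarity(x, y):
    """Map the Euclidean distance between feature vectors into (0, 1]."""
    dist = sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))
    return 1.0 / (1.0 + dist)


def hybrid_clusters(features, sessions, co_min=0.3, feat_min=0.6):
    """Link entities that pass BOTH thresholds, then return the
    connected components as sorted lists (hypothetical scheme)."""
    parent = {e: e for e in features}

    def find(e):
        # Union-find lookup with path halving.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in combinations(features, 2):
        co = cooccurrence_similarity(a, b, sessions)
        fs = feature_similarity(features[a], features[b])
        if co >= co_min and fs >= feat_min:
            parent[find(a)] = find(b)

    groups = {}
    for e in features:
        groups.setdefault(find(e), set()).add(e)
    return sorted(sorted(g) for g in groups.values())


# Toy data (invented): normalized numeric features per entity and
# the sets of entities that appeared together in a session.
features = {
    "cam_a": (0.9, 0.8),
    "cam_b": (0.8, 0.9),
    "cam_c": (0.1, 0.2),
}
sessions = [
    {"cam_a", "cam_b"},
    {"cam_a", "cam_b", "cam_c"},
    {"cam_a", "cam_c"},
    {"cam_b", "cam_c"},
]

print(hybrid_clusters(features, sessions))
```

In this toy data every pair co-occurs equally often, so co-occurrence alone would merge all three entities; the feature threshold is what separates `cam_c` from the other two. This mirrors, in a very simplified way, the abstract's point that feature-based similarity can disambiguate groupings suggested by co-occurrence.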

Cite

Lee, J., Cho, H., Park, J. W., Cha, Y. R., Hwang, S. W., Nie, Z., & Wen, J. R. (2013). Hybrid entity clustering using crowds and data. VLDB Journal, 22(5), 711–726. https://doi.org/10.1007/s00778-013-0328-8
