The grouped author-topic model for unsupervised entity resolution

Andrew M. Dai; Amos J. Storkey

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6791 LNCS(PART 1) 241-249

DOI: 10.1007/978-3-642-21735-7_30

Abstract

This paper describes a generative approach for tackling the problem of entity resolution in a completely unsupervised context with no fixed assumption regarding the true number of identities. The problem of entity resolution involves associating different references to authors (in a paper's author list, for example) with real underlying identities. The references may be written in differing forms or may contain errors, and identical references may refer to different real identities. The approach taken here uses a generative model of both the abstract of a document and its list of authors to resolve identities in a corpus of documents. In the model, authors and topics are associated with latent groups. For each document, an abstract and an author list are generated conditioned on a given group. Results are presented on real-world datasets, where the model outperforms the best-performing unsupervised methods. © 2011 Springer-Verlag.
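The generative story in the abstract (a latent group per document, with the abstract and the author list both drawn conditioned on that group) can be illustrated with a small sketch. This is not the authors' model or code: the abstract gives no priors, inference procedure, or noise model for name variants, so the symmetric Dirichlet draws, the uniform choice of group, the fixed document lengths, and every name below (`dirichlet`, `generate_corpus`, `group_topics`, `group_authors`, `topic_words`) are assumptions made purely for illustration.

```python
import random


def dirichlet(rng, alpha, size):
    """Draw one symmetric Dirichlet sample by normalizing Gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, n_groups, n_topics, vocab, authors,
                    words_per_doc=8, authors_per_doc=2, seed=0):
    """Toy forward sampler: group -> (topics -> words) and group -> authors.

    All distributions and hyperparameters here are hypothetical choices,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    # Each latent group has its own distribution over topics and over authors.
    group_topics = [dirichlet(rng, 1.0, n_topics) for _ in range(n_groups)]
    group_authors = [dirichlet(rng, 1.0, len(authors)) for _ in range(n_groups)]
    # Each topic has its own distribution over vocabulary words.
    topic_words = [dirichlet(rng, 1.0, len(vocab)) for _ in range(n_topics)]

    docs = []
    for _ in range(n_docs):
        # Assumption: the document's group is chosen uniformly at random.
        g = rng.randrange(n_groups)
        # Abstract: one topic per word position, then one word per topic.
        topics = rng.choices(range(n_topics), weights=group_topics[g],
                             k=words_per_doc)
        abstract = [rng.choices(vocab, weights=topic_words[t])[0]
                    for t in topics]
        # Author list: drawn with replacement, so repeats are possible
        # in this simplified sketch.
        author_list = rng.choices(authors, weights=group_authors[g],
                                  k=authors_per_doc)
        docs.append({"group": g, "abstract": abstract, "authors": author_list})
    return docs
```

Calling `generate_corpus(5, 2, 3, ["model", "topic", "entity", "group"], ["A. Dai", "A. Storkey", "J. Smith"])` returns five synthetic documents. The sketch covers only the forward sampling direction. Resolving identities would require inferring the latent groups and identities from observed, possibly noisy references, which is not shown here.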
