A single logical entity can be referred to by several different names across a large text corpus. We present an algorithm for finding all such co-reference sets in a large corpus. Our algorithm involves three steps: morphological similarity detection, contextual similarity analysis, and clustering. Finally, we present experimental results on a large corpus of real news text to analyze the performance of our techniques. © Springer-Verlag Berlin Heidelberg 2006.
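The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measures, so the string-ratio measure, the cosine measure over co-occurring words, the thresholds `morph_min` and `ctx_min`, the connected-components clustering, and all function names and sample names below are assumptions chosen only for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations
from math import sqrt


def morphological_similarity(a: str, b: str) -> float:
    """Step 1 (assumed measure): character-level similarity of two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def contextual_similarity(ctx_a: Counter, ctx_b: Counter) -> float:
    """Step 2 (assumed measure): cosine similarity of co-occurring word counts."""
    dot = sum(ctx_a[w] * ctx_b[w] for w in ctx_a.keys() & ctx_b.keys())
    norm_a = sqrt(sum(v * v for v in ctx_a.values()))
    norm_b = sqrt(sum(v * v for v in ctx_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coreference_sets(contexts, morph_min=0.6, ctx_min=0.5):
    """Step 3 (assumed method): merge names that pass both tests (union-find)."""
    parent = {name: name for name in contexts}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(contexts, 2):
        if (morphological_similarity(a, b) >= morph_min
                and contextual_similarity(contexts[a], contexts[b]) >= ctx_min):
            parent[find(a)] = find(b)

    groups = {}
    for name in contexts:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical names and context word counts, invented for this example.
contexts = {
    "Maria Gonzalez": Counter({"senator": 3, "vote": 1}),
    "Maria Gonzales": Counter({"senator": 2, "bill": 1}),
    "Mario Gonzalez": Counter({"striker": 2, "goal": 1}),
}
sets = coreference_sets(contexts)
```

In this toy input all three names are morphologically close, but only the two spellings that share context words end up in the same set, which is the role the contextual step plays in such a pipeline. Comparing every pair of names is quadratic, so a corpus-scale system would need some way to restrict the candidate pairs.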
CITATION STYLE
Lloyd, L., Mehler, A., & Skiena, S. (2006). Identifying co-referential names across large corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4009 LNCS, pp. 12–23). Springer Verlag. https://doi.org/10.1007/11780441_3