Coreferencing entities across documents in a large corpus enables advanced document understanding tasks such as question answering. This paper presents a novel cross-document coreference approach that leverages entity profiles, which are constructed using information extraction tools and reconciled using a within-document coreference module. We propose to match the profiles using a learned ensemble distance function composed of a suite of similarity specialists. We develop a kernelized soft relational clustering algorithm that uses the learned distance function to partition the entities into fuzzy sets of identities. We compare the kernelized clustering method with a popular fuzzy relation clustering algorithm (FRC) and show a 5% improvement in coreference performance. Evaluation of our proposed methods on a large benchmark disambiguation collection shows that they compare favorably with the top runs in the SemEval evaluation. © 2009 ACL and AFNLP.
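The pipeline summarized above (profile matching through an ensemble distance, then soft clustering in a kernel-induced space) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the specialists, the learning procedure, or the exact clustering updates. The sketch assumes a Jaccard-based specialist per profile field, fixed weights standing in for learned ones, a Gaussian kernel over the ensemble distance, and a generic kernel fuzzy c-means update. All function names, field names, and parameter values (`sigma`, `m`, the toy profiles) are hypothetical.

```python
import math

def jaccard_distance(a, b):
    """Hypothetical similarity specialist: 1 - Jaccard overlap of two token lists."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def ensemble_distance(p, q, weights):
    """Weighted average of per-field specialist distances.
    The weights are assumed to have been learned elsewhere."""
    total = sum(weights.values())
    return sum(w * jaccard_distance(p[f], q[f]) for f, w in weights.items()) / total

def gaussian_kernel(profiles, weights, sigma=0.5):
    """Kernel matrix K[i][j] = exp(-d(i, j)^2 / (2 * sigma^2))."""
    n = len(profiles)
    return [[math.exp(-ensemble_distance(profiles[i], profiles[j], weights) ** 2
                      / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def kernel_fuzzy_cluster(K, init_u, m=2.0, iters=50, eps=1e-9):
    """Generic kernel fuzzy c-means on a kernel matrix.
    init_u[i][j] is the initial membership of item j in cluster i."""
    n = len(K)
    u = [row[:] for row in init_u]
    for _ in range(iters):
        dist = []
        for row in u:
            w = [x ** m for x in row]
            s = sum(w)
            # Squared norm of the implicit cluster center in feature space.
            center = sum(w[k] * w[l] * K[k][l]
                         for k in range(n) for l in range(n)) / (s * s)
            # Squared feature-space distance from each item to that center.
            dist.append([max(K[j][j]
                             - 2 * sum(w[k] * K[k][j] for k in range(n)) / s
                             + center, eps)
                         for j in range(n)])
        # Standard fuzzy membership update; each column sums to 1.
        for j in range(n):
            inv = [d[j] ** (-1.0 / (m - 1)) for d in dist]
            z = sum(inv)
            for i in range(len(u)):
                u[i][j] = inv[i] / z
    return u

# Toy profiles for four mentions of the same name (illustrative data only).
profiles = [
    {"context": ["senator", "vote", "bill"], "orgs": ["senate"]},
    {"context": ["senator", "bill", "committee"], "orgs": ["senate"]},
    {"context": ["striker", "goal", "match"], "orgs": ["fc"]},
    {"context": ["goal", "match", "league"], "orgs": ["fc"]},
]
weights = {"context": 0.6, "orgs": 0.4}
K = gaussian_kernel(profiles, weights)
init_u = [[0.9, 0.5, 0.5, 0.1], [0.1, 0.5, 0.5, 0.9]]
memberships = kernel_fuzzy_cluster(K, init_u)
```

In this toy run, `memberships[0]` gives each mention's degree of membership in the first identity, so mentions are assigned softly and do not need a hard partition. The initial memberships only break the symmetry between the two clusters; how the number of identities is chosen is left open here.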
CITATION STYLE
Huang, J., Taylor, S. M., Smith, J. L., Fotiadis, K. A., & Giles, C. L. (2009). Profile based cross-document coreference using kernelized fuzzy relational clustering. In ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf. (pp. 414–422). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1687878.1687937