Abstract
Two RDF instances are said to corefer when they are intended to denote the same thing in the world, for example, when two nodes of type foaf:Person describe the same individual. Resolving coreference is central to integrating and inter-linking semi-structured datasets. We are developing an online, unsupervised coreference resolution framework for heterogeneous, semi-structured data. The online aspect requires us to process new instances as they appear rather than as a batch. The instances are heterogeneous in that they may contain terms from different ontologies whose alignments are not known in advance. Our framework comprises a two-phase clustering algorithm that is both flexible and distributable, a probabilistic multidimensional attribute model that will support robust schema mappings, and an instance consolidation algorithm intended to improve accuracy over time by addressing data sparseness. © 2012 Springer-Verlag Berlin Heidelberg.
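The abstract does not spell out the clustering algorithm, so the following is only an illustrative sketch of what online, two-phase processing could look like, not the author's method. Everything in it is an assumption made for illustration: the `Instance` and `OnlineResolver` names, the token-based candidate index used as the first phase, the Jaccard similarity used as the second phase, and the 0.5 threshold. Property names are deliberately ignored when comparing values, which is one simple way to cope with ontologies whose alignment is unknown.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical simplified RDF instance: a URI plus property -> values."""
    uri: str
    props: dict[str, set[str]]


def tokens(inst: Instance) -> set[str]:
    """Lower-cased value tokens; property names are ignored on purpose."""
    return {
        tok
        for values in inst.props.values()
        for value in values
        for tok in value.lower().split()
    }


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class OnlineResolver:
    """Assigns each arriving instance to a cluster, one instance at a time."""
    threshold: float = 0.5  # assumed cut-off, not taken from the paper
    clusters: list[list[Instance]] = field(default_factory=list)
    index: dict[str, set[int]] = field(default_factory=dict)  # token -> cluster ids

    def add(self, inst: Instance) -> int:
        toks = tokens(inst)

        # Phase 1 (assumed): cheap candidate selection via shared tokens.
        candidates: set[int] = set()
        for tok in toks:
            candidates |= self.index.get(tok, set())

        # Phase 2 (assumed): finer comparison against candidate clusters only.
        best, best_score = None, 0.0
        for cid in sorted(candidates):
            cluster_toks = set().union(*(tokens(m) for m in self.clusters[cid]))
            score = jaccard(toks, cluster_toks)
            if score > best_score:
                best, best_score = cid, score

        # Open a new cluster when nothing is similar enough.
        if best is None or best_score < self.threshold:
            self.clusters.append([])
            best = len(self.clusters) - 1

        self.clusters[best].append(inst)
        for tok in toks:
            self.index.setdefault(tok, set()).add(best)
        return best


resolver = OnlineResolver()
a = Instance("ex:a", {"foaf:name": {"Jane Doe"}, "foaf:mbox": {"jane@example.org"}})
b = Instance("ex:b", {"vcard:fn": {"Jane Doe"}, "vcard:email": {"jane@example.org"}})
c = Instance("ex:c", {"foaf:name": {"John Smith"}})
assignments = [resolver.add(x) for x in (a, b, c)]
```

In this toy run the two "Jane Doe" instances share a cluster even though they use different vocabularies, while the third instance starts a new cluster. A real system would replace the token overlap with the attribute model the abstract mentions.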
Cite
Sleeman, J. (2012). Online unsupervised coreference resolution for semi-structured heterogeneous data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7650 LNCS, pp. 457–460). Springer Verlag. https://doi.org/10.1007/978-3-642-35173-0_39