Over the past 15 years the government has funded research in information extraction, with the goal of developing the technology to extract entities, events, and their interrelationships from free text for further analysis. A crucial component of linking entities across documents is the ability to recognize when different name strings are potential references to the same entity. Given the extraordinary range of variation international names can take when rendered in the Roman alphabet, this is a daunting task. This paper surveys existing technologies for name matching and for accomplishing pieces of the cross-document extraction and linking task. It proposes a direction for future work in which existing entity extraction, coreference, and database name matching technologies would be harnessed for cross-document coreference and linking capabilities. The extension of name variant matching to free text will add important text mining functionality for intelligence and security informatics toolkits.
Mendeley saves you time finding and organizing research
There are no full text links
Choose a citation style from the tabs below