Entity annotation involves attaching a label such as 'name' or 'organization' to a sequence of tokens in a document. All current rule-based and machine learning-based approaches to this task operate at the document level. We present a new and generic approach to entity annotation that uses the inverse index typically created for rapid keyword-based searching of a document collection. We define a set of operations on the inverse index that allows us to create annotations defined by cascading regular expressions. The entity annotations for an entire document corpus can be created purely from the index, with no need to access the original documents. Experiments on two publicly available data sets show very significant performance improvements over document-based annotators. © 2006 Association for Computational Linguistics.
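The abstract does not list the index operations themselves, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy positional index over whitespace-separated tokens, and the names `build_index`, `match_terms`, and `concat` are hypothetical. A regular expression is matched against index terms rather than documents, and an adjacency operation combines posting lists into longer spans, which can in turn feed further rules in a cascade.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each token to a set of (doc_id, position) postings."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for pos, token in enumerate(text.split()):
            index[token].add((doc_id, pos))
    return index


def match_terms(index, pattern):
    """Return (doc_id, start, end) spans for every index term that fully matches the regex."""
    rx = re.compile(pattern)
    spans = set()
    for term, postings in index.items():
        if rx.fullmatch(term):
            spans |= {(doc_id, pos, pos) for doc_id, pos in postings}
    return spans


def concat(left, right):
    """Return spans where a left span is immediately followed by a right span."""
    ends_by_start = defaultdict(list)
    for doc_id, start, end in right:
        ends_by_start[(doc_id, start)].append(end)
    return {
        (doc_id, start, right_end)
        for doc_id, start, end in left
        for right_end in ends_by_start.get((doc_id, end + 1), [])
    }


# Illustrative data only; the documents are read once to build the index
# and are not consulted again when the annotations are computed.
docs = ["Mr. Smith met Dr. Jones", "Dr. Watson left"]
index = build_index(docs)

titles = match_terms(index, r"(Mr|Dr)\.")         # rule 1: honorific
capitalized = match_terms(index, r"[A-Z][a-z]+")  # rule 2: capitalized word
persons = concat(titles, capitalized)             # rule 3: honorific followed by capitalized word

print(sorted(persons))
```

Because `concat` returns spans in the same form it accepts, its output can be passed to another `concat` call, which is one simple way to model a cascade of rules over the whole collection.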
CITATION STYLE
Ramakrishnan, G., Balakrishnan, S., & Joshi, S. (2006). Entity annotation based on inverse index operations. In COLING/ACL 2006 - EMNLP 2006: 2006 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 492–500). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1610075.1610143