Abstract
In this paper, we describe an integrated approach to entity mention detection that yields a monolithic, almost language-independent system. It is optimal in the sense that all categorical constraints are considered simultaneously. The system is compact and easy to develop and maintain, since only a single set of features and classifiers needs to be designed and optimized. It is implemented using one-versus-all support vector machine (SVM) classifiers and a number of feature extractors at several linguistic levels. SVMs are well known for their ability to handle a large set of overlapping features with theoretically sound generalization properties. Data sparsity could be a serious issue given the large number of classes and the relatively moderate amount of training data. However, our results show that the integrated system performs as well as a pipelined system that decomposes the problem into a few smaller subtasks. We conduct all our experiments using ACE 2004 data, evaluate the systems using ACE metrics, and report competitive performance. © 2005 Association for Computational Linguistics.
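The integrated, one-versus-all setup described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the reduced label inventories, the fused label format, the feature names, and the hand-set weights that stand in for trained SVM weight vectors. The sketch only shows why fusing all mention attributes into one label enlarges the class set, and how one binary scorer per class yields a single joint decision.

```python
from itertools import product

# Hypothetical, heavily reduced label inventories (not the ACE 2004 inventory).
CHUNK_TAGS = ["B", "I"]
ENTITY_TYPES = ["PER", "ORG", "GPE"]
MENTION_LEVELS = ["NAM", "NOM", "PRO"]


def integrated_labels():
    """Fuse every attribute combination into one class label, plus 'O'."""
    labels = ["O"]
    for tag, etype, level in product(CHUNK_TAGS, ENTITY_TYPES, MENTION_LEVELS):
        labels.append(f"{tag}-{etype}-{level}")
    return labels


def score(weights, features):
    """Linear decision value of one binary (class-versus-rest) classifier."""
    return sum(weights.get(f, 0.0) for f in features)


def predict_one_vs_all(classifiers, features):
    """Pick the class whose binary classifier gives the highest score."""
    return max(classifiers, key=lambda label: score(classifiers[label], features))


LABELS = integrated_labels()

# Toy hand-set weights standing in for trained SVM weight vectors (assumption).
CLASSIFIERS = {label: {} for label in LABELS}
CLASSIFIERS["O"] = {"bias": 0.1}
CLASSIFIERS["B-PER-NAM"] = {"capitalized": 1.0, "prev=Mr.": 1.5}
CLASSIFIERS["B-ORG-NAM"] = {"capitalized": 1.0, "suffix=Inc": 1.5}

# Features of a single token, as a feature extractor might emit them.
token_features = ["bias", "capitalized", "prev=Mr."]
prediction = predict_one_vs_all(CLASSIFIERS, token_features)
```

Even this toy inventory produces 19 classes from three small attribute sets, which hints at why data sparsity is a concern for the integrated approach.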
Cite
Hacioglu, K., Douglas, B., & Chen, Y. (2005). Detection of entity mentions occurring in English and Chinese text. In HLT/EMNLP 2005 - Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 379–386). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220575.1220623