Web-based sources for an annotated corpus building and composite proper name identification

Sofía N. Galicia-Haro; Alexander Gelbukh; Igor A. Bolshakov

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 3034 115-124

DOI: 10.1007/978-3-540-24681-7_14

Abstract

Nowadays, collections of texts with annotations on several levels are useful resources. Huge efforts are required to develop such resources for languages like Spanish. In this work, we present the initial step, lexical-level annotation, for the compilation of an annotated Mexican corpus using Web-based sources. We also describe a method based on heterogeneous knowledge and simple Web-based sources for the proper name identification required in such annotation. We focused our work on composite entities (names with coordinated constituents, names with several prepositional phrases, and names of songs, books, movies, etc.). Preliminary results are presented.

Cite

Galicia-Haro, S. N., Gelbukh, A., & Bolshakov, I. A. (2004). Web-based sources for an annotated corpus building and composite proper name identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3034, pp. 115–124). Springer Verlag. https://doi.org/10.1007/978-3-540-24681-7_14
