Text analysis and information extraction from Spanish written documents

Roberto Costumero; Ángel García-Pedrero; Consuelo Gonzalo-Martín; Ernestina Menasalvas; Socorro Millan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8609 LNAI 188-197

DOI: 10.1007/978-3-319-09891-3_18

7 Citations

34 Readers

Abstract

Despite the spread of Electronic Health Records (EHRs) in Spanish hospitals, and although Spanish ranks second among languages by number of speakers, to the best of our knowledge there are no natural language processing tools for medical texts written in Spanish. This paper presents an approach based on OpenNLP to process natural language texts written in Spanish for information extraction. The main goal is to integrate our development with cTAKES. As cTAKES has been specifically trained for the clinical domain, in this paper we train the main modules on a general-purpose annotated Spanish corpus and on an in-house corpus built from medical documents, testing both on a set of medical documents. The best performance of the individual components when tested on medical documents was: sentence boundary detector accuracy = 0.872; part-of-speech tagger accuracy = 0.946; chunker = 0.909. © 2014 Springer International Publishing.

Cite

Costumero, R., García-Pedrero, Á., Gonzalo-Martín, C., Menasalvas, E., & Millan, S. (2014). Text analysis and information extraction from Spanish written documents. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8609 LNAI, pp. 188–197). Springer Verlag. https://doi.org/10.1007/978-3-319-09891-3_18
