Shallow Parsing with Apache UIMA

Graham Wilcock

PACLING 2009 - Conference of the Pacific Association for Computational Linguistics (2009) 23-27

Abstract

Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic annotation and text analytics. Its support for standards, interoperability and scalability makes UIMA attractive to NLP researchers. The paper describes shallow parsing as an example of configuring existing NLP tools to perform a task in the UIMA framework. First, part-of-speech tagging is done using the OpenNLP tagger. Next, full syntactic parsing by the OpenNLP parser is shown. UIMA has ready-made configurations for both tasks. As expected, tagging is fast and full parsing is slow. Shallow parsing was enabled by adding a UIMA wrapper for the OpenNLP chunker and by extending the UIMA type system to include chunk labels. Shallow parsing with the chunker is fast, like tagging. The chunks are displayed in the UIMA Annotation Viewer by re-using phrase types already defined for the full parser.
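To make the chunking step concrete, the following is a minimal sketch of how per-token chunk tags could be grouped into labelled phrase spans before being stored as annotations. It is not the wrapper described in the paper. It assumes the chunker emits one BIO-style tag per token (such as B-NP, I-NP, O), and the names `ChunkSpans`, `Chunk` and `fromBio` are hypothetical, introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of UIMA or OpenNLP.
final class ChunkSpans {

    /** One chunk: tokens in [start, end) share a phrase label such as NP. */
    static final class Chunk {
        final String label;
        final int start; // index of first token in the chunk
        final int end;   // index one past the last token in the chunk

        Chunk(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return label + "[" + start + "," + end + ")";
        }
    }

    /**
     * Groups BIO tags into chunks. Assumption: "B-X" opens a chunk of type X,
     * "I-X" continues an open chunk of type X, and "O" is outside any chunk.
     * A stray "I-X" with no matching open chunk is treated as opening one.
     */
    static List<Chunk> fromBio(String[] tags) {
        List<Chunk> chunks = new ArrayList<>();
        String label = null; // label of the currently open chunk, if any
        int start = -1;      // token index where the open chunk began

        // Iterate one step past the end so the last open chunk is closed.
        for (int i = 0; i <= tags.length; i++) {
            String tag = i < tags.length ? tags[i] : "O";
            boolean continues = label != null && tag.equals("I-" + label);
            if (continues) {
                continue;
            }
            if (label != null) {
                chunks.add(new Chunk(label, start, i));
                label = null;
            }
            if (tag.startsWith("B-") || tag.startsWith("I-")) {
                label = tag.substring(2);
                start = i;
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tags for: The quick fox jumped over the dog .
        String[] tags = {"B-NP", "I-NP", "I-NP", "B-VP", "B-PP", "B-NP", "I-NP", "O"};
        for (Chunk c : fromBio(tags)) {
            System.out.println(c);
        }
    }
}
```

In a UIMA setting, each span would presumably be mapped from token indices to character offsets and added to the CAS as an annotation whose type corresponds to the chunk label. That mapping is an assumption here and is left out of the sketch.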

Citation (APA)

Wilcock, G. (2009). Shallow Parsing with Apache UIMA. PACLING 2009 - Conference of the Pacific Association for Computational Linguistics, 23–27. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.2595
