Shallow Parsing with Apache UIMA

  • Wilcock G

Abstract

Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic annotation and text analytics. Its support for standards, interoperability and scalability makes UIMA attractive for NLP researchers. The paper describes shallow parsing as an example of configuring existing NLP tools to perform a task in the UIMA framework. First, part-of-speech tagging is done using the OpenNLP tagger. Next, full syntactic parsing by the OpenNLP parser is shown. UIMA has ready-made configurations for both of these tasks. As expected, tagging is fast and full parsing is slow. Shallow parsing was enabled by adding a UIMA wrapper for the OpenNLP chunker and by extending the UIMA type system to include chunk labels. Like tagging, shallow parsing with the chunker is fast. The chunks are displayed in the UIMA Annotation Viewer by re-using the phrase types already defined for the full parser.
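
As a rough illustration of what chunk labels look like once they are turned into phrase annotations, the sketch below groups per-token chunk tags into labelled spans. It assumes the chunker emits tags in the common B-/I-/O convention (for example B-NP, I-NP, O). The function name bio_to_chunks and the sample sentence are hypothetical, and the code does not call UIMA or OpenNLP; it only shows the span-building step that a wrapper of this kind would plausibly need.

```python
from typing import List, Tuple

Chunk = Tuple[str, int, int, str]  # (label, start token, end token exclusive, text)


def bio_to_chunks(tokens: List[str], tags: List[str]) -> List[Chunk]:
    """Group tokens into labelled chunks from B-/I-/O tags (assumed tag scheme)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    chunks: List[Chunk] = []
    start = None   # index of the first token in the open chunk
    label = None   # phrase label of the open chunk, e.g. "NP"

    for i, tag in enumerate(tags):
        continues = tag.startswith("I-") and tag[2:] == label
        if continues:
            continue  # token extends the chunk that is already open

        # Any other tag closes the open chunk, if there is one.
        if label is not None:
            chunks.append((label, start, i, " ".join(tokens[start:i])))
            start, label = None, None

        # A B- tag (or a stray I- tag with a new label) opens a new chunk.
        if tag != "O":
            start, label = i, tag[2:]

    # Close a chunk that runs to the end of the sentence.
    if label is not None:
        chunks.append((label, start, len(tokens), " ".join(tokens[start:])))
    return chunks


if __name__ == "__main__":
    # Hypothetical sample sentence with hand-written tags.
    tokens = ["The", "quick", "fox", "jumped", "over", "the", "fence", "."]
    tags = ["B-NP", "I-NP", "I-NP", "B-VP", "B-PP", "B-NP", "I-NP", "O"]
    for chunk in bio_to_chunks(tokens, tags):
        print(chunk)
```

Each resulting tuple carries a phrase label and a token span, which is the kind of information an annotation with begin and end offsets would hold; mapping token indices to character offsets is left out of this sketch.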

Cite

Wilcock, G. (2009). Shallow Parsing with Apache UIMA. PACLING 2009 - Conference of the Pacific Association for Computational Linguistics, 23–27. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.2595
