Apache UIMA (Unstructured Information Management Architecture) is a framework for linguistic annotation and text analytics. Its support for standards, interoperability and scalability makes UIMA attractive for NLP researchers. This paper describes shallow parsing as an example of configuring existing NLP tools to perform a task in the UIMA framework. First, part-of-speech tagging is done using the OpenNLP tagger. Next, full syntactic parsing with the OpenNLP parser is shown. UIMA has ready-made configurations for both tasks. As expected, tagging is fast and full parsing is slow. Shallow parsing was enabled by adding a UIMA wrapper for the OpenNLP chunker and by extending the UIMA type system to include chunk labels. Shallow parsing with the chunker is fast, like tagging. The chunks are displayed in the UIMA Annotation Viewer by reusing phrase types already defined for the full parser.
Wilcock, G. (2009). Shallow Parsing with Apache UIMA. PACLING 2009 - Conference of the Pacific Association for Computational Linguistics, 23–27. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.2595