The realization of a text analysis process as a sequential execution of the algorithms in a pipeline does not mimic the way humans approach text analysis tasks. Humans simultaneously investigate lexical, syntactic, semantic, and pragmatic clues in and about a text (McCallum 2009) while skimming over the text to quickly focus on the portions relevant to a task (Duggan and Payne 2009). From a machine viewpoint, however, the decomposition of a text analysis process into single executable steps is a prerequisite for identifying relevant information types and their interdependencies. To date, this decomposition and the subsequent construction of a text analysis pipeline are mostly done manually, which prevents the use of pipelines for tasks in ad-hoc text mining. Moreover, such pipelines do not focus on the task-relevant portions of input texts, making their execution slower than necessary (cf. Sect. 2.2). In this chapter, we show that both parts of pipeline design (i.e., construction and task-specific execution) can be fully automated, given adequate formalizations of text analysis.
CITATION STYLE
Doyle, A. C. (2015). Pipeline design. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9383, pp. 55–121). Springer Verlag. https://doi.org/10.1007/978-3-319-25741-9_3