Text segmentation techniques: A critical review

Abstract

Text segmentation is a method of splitting a document into smaller parts, which are usually called segments. It is widely used in text processing. Each segment carries its own relevant meaning. These segments are categorized as word, sentence, topic, phrase or any other information unit, depending on the task of the text analysis. This study presents the various reasons for using text segmentation across different analysis approaches. We categorized the types of documents and languages used. The main contributions of this study are a summary of 50 research papers and an overview of the past decade (January 2007 to January 2017) of research that applied text segmentation as the main approach for analysing text. Results revealed the popularity of using text segmentation in analysing different languages. In addition, the word appears to be the most practical and usable segment, as it is a smaller unit than the phrase, sentence or line.

Pak, I., & Teh, P. L. (2018). Text segmentation techniques: A critical review. In Studies in Computational Intelligence (Vol. 741, pp. 167–181). Springer Verlag. https://doi.org/10.1007/978-3-319-66984-7_10
