Improving text segmentation with non-systematic semantic relation

3Citations
Citations of this article
7Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Text segmentation is a fundamental problem in natural language processing, which has application in information retrieval, question answering, and text summarization. Almost previous works on unsupervised text segmentation are based on the assumption of lexical cohesion, which is indicated by relations between words in the two units of text. However, they only take into account the reiteration, which is a category of lexical cohesion, such as word repetition, synonym or superordinate. In this research, we investigate the non-systematic semantic relation, which is classified as collocation in lexical cohesion. This relation holds between two words or phrases in a discourse when they pertain to a particular theme or topic. This relation has been recognized via a topic model, which is, in turn, acquired from a large collection of texts. The experimental results on the public dataset show the advantages of our approach in comparison to the available unsupervised approaches. © 2011 Springer-Verlag.

Cite

CITATION STYLE

APA

Nguyen, V. C., Nguyen, L. M., & Shimazu, A. (2011). Improving text segmentation with non-systematic semantic relation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6608 LNCS, pp. 304–315). https://doi.org/10.1007/978-3-642-19400-9_24

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free