Improving text segmentation with non-systematic semantic relation

Viet Cuong Nguyen; Le Minh Nguyen; Akira Shimazu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6608 LNCS(PART 1) 304-315

DOI: 10.1007/978-3-642-19400-9_24

3 Citations

7 Readers

Abstract

Text segmentation is a fundamental problem in natural language processing, with applications in information retrieval, question answering, and text summarization. Almost all previous work on unsupervised text segmentation is based on the assumption of lexical cohesion, which is indicated by relations between words in two units of text. However, these approaches take into account only reiteration, a category of lexical cohesion that covers word repetition, synonyms, and superordinates. In this research, we investigate the non-systematic semantic relation, which is classified as collocation in lexical cohesion. This relation holds between two words or phrases in a discourse when they pertain to a particular theme or topic. We recognize this relation via a topic model, which is in turn acquired from a large collection of texts. Experimental results on a public dataset show the advantages of our approach over the available unsupervised approaches. © 2011 Springer-Verlag.
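The contrast the abstract draws between reiteration and topic-based relations can be illustrated with a small sketch. This is not the authors' method: the `WORD_TOPIC` table is a hand-made, hypothetical stand-in for a topic model learned from a large corpus, and the function names (`gap_scores`, `best_boundary`, and so on) are invented for illustration. The sketch scores the cohesion between adjacent sentences with cosine similarity and places one boundary at the weakest gap.

```python
from collections import Counter
from math import sqrt

# Hypothetical word-to-topic table (assumption): a real system would derive
# this from a topic model trained on a large text collection.
WORD_TOPIC = {
    "doctor": "health", "nurse": "health", "hospital": "health", "patient": "health",
    "goal": "sport", "striker": "sport", "match": "sport", "referee": "sport",
}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def word_vector(sentence):
    """Reiteration-style representation: raw word counts."""
    return Counter(sentence.lower().split())

def topic_vector(sentence):
    """Topic-style representation: counts of the topics the words map to."""
    return Counter(WORD_TOPIC[w] for w in sentence.lower().split() if w in WORD_TOPIC)

def gap_scores(sentences, vectorize):
    """Cohesion score for each gap between adjacent sentences."""
    vecs = [vectorize(s) for s in sentences]
    return [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]

def best_boundary(sentences, vectorize):
    """Index of the sentence that starts the second segment (weakest gap)."""
    scores = gap_scores(sentences, vectorize)
    return scores.index(min(scores)) + 1

SENTENCES = [
    "the doctor entered the hospital",
    "a nurse greeted the patient",
    "the striker scored a goal",
    "the referee ended the match",
]

print(gap_scores(SENTENCES, topic_vector))
print(best_boundary(SENTENCES, topic_vector))
```

In this toy input no content word is repeated across sentences, so word counts alone see only shared function words, while the topic-based vectors link "doctor" with "nurse" and "striker" with "referee" and put the boundary between the second and third sentences.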

Cite

APA

Nguyen, V. C., Nguyen, L. M., & Shimazu, A. (2011). Improving text segmentation with non-systematic semantic relation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6608 LNCS, pp. 304–315). https://doi.org/10.1007/978-3-642-19400-9_24
