Text segmentation based on document understanding for information retrieval

Violaine Prince; Alexandre Labadié

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4592 LNCS, pp. 295-304 (2007)

DOI: 10.1007/978-3-540-73351-5_26

Abstract

Information retrieval needs to match relevant texts with a given query. Selecting appropriate parts is useful when documents are long and only some portions interest the user. In this paper, we describe a method that makes extensive use of natural language techniques for text segmentation based on topic change detection. The method requires an NLP parser and a semantic representation in Roget-based vectors. We ran the experiment on French documents, for which we have the appropriate tools, but the method could be transposed to any other language that meets the same requirements. The article sketches an overview of the functionalities of the NL understanding environment and the algorithms related to our text segmentation method. An experiment in text segmentation is also presented, and its result in an information retrieval task is shown. © Springer-Verlag Berlin Heidelberg 2007.
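
The abstract does not spell out the segmentation algorithm, so the following is only a generic illustration of topic change detection over semantic vectors, not the authors' method. It is a minimal sketch, assuming each sentence has already been mapped to a numeric concept vector (standing in for the Roget-based representation) and that a topic boundary is placed wherever the cosine similarity between adjacent sentences falls below a threshold. The function names, the threshold value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def segment_boundaries(sentence_vectors, threshold=0.5):
    """Return each index i at which a new segment starts.

    A boundary is placed before sentence i when its similarity to
    sentence i - 1 is below the threshold (an assumed criterion,
    chosen for illustration only).
    """
    boundaries = []
    for i in range(1, len(sentence_vectors)):
        if cosine(sentence_vectors[i - 1], sentence_vectors[i]) < threshold:
            boundaries.append(i)
    return boundaries


# Toy 3-dimensional "concept" vectors, invented for this example.
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.2],
]
print(segment_boundaries(vectors))  # → [2]
```

In this toy input the first two sentences share one dominant concept and the last two share another, so the only boundary falls before the third sentence. In a retrieval setting, the resulting segments rather than whole documents would then be matched against the query.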

Citation (APA)

Prince, V., & Labadié, A. (2007). Text segmentation based on document understanding for information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4592 LNCS, pp. 295–304). Springer Verlag. https://doi.org/10.1007/978-3-540-73351-5_26
