In this paper we examine whether, and to what extent, text segmentation can improve an information retrieval system. We assume that topical text segmentation allows one to better model text structure, and therefore language itself, which in turn affects the quality of text representation. We propose a search pipeline based on text segmentation using the BigARTM tool and the TopicTiling algorithm. We test this hypothesis in experiments with several baseline models on two text collections. The results are mixed: on one collection segmentation improves retrieval quality, while on the other it has no significant effect.
CITATION STYLE
Shtekh, G., Kazakova, P., Nikitinsky, N., & Skachkov, N. (2018). Exploring influence of topic segmentation on information retrieval quality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11193 LNCS, pp. 131–140). Springer Verlag. https://doi.org/10.1007/978-3-030-01437-7_11