Most documents are about more than one subject, but many NLP and IR techniques implicitly assume documents have just one topic. We describe new clues that mark shifts to new topics, novel algorithms for identifying topic boundaries and the uses of such boundaries once identified. We report topic segmentation performance on several corpora as well as improvement on an IR task that benefits from good segmentation.
CITATION STYLE
Reynar, J. C. (1999). Statistical models for topic segmentation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1999-June, pp. 357–364). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1034678.1034735