Abstract
In this paper we propose a domain-independent text segmentation method consisting of three components. Latent Dirichlet allocation (LDA) is employed to compute the semantic distribution of words, the Fisher kernel is used to measure semantic similarity, and dynamic programming is used to find the globally best segmentation. Experiments on Chinese data sets show that the technique can be effective. Because it introduces latent semantic information, our algorithm is robust to irregular-sized segments.
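The dynamic-programming component can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `segment`, the `penalty` parameter, and the scoring rule (summed within-segment similarity divided by segment length, minus a fixed cost per segment) are assumptions made for illustration. The input `sim` stands in for a sentence-by-sentence similarity matrix, which in the paper would come from the LDA-based Fisher kernel.

```python
from typing import List, Sequence


def segment(sim: Sequence[Sequence[float]], penalty: float = 0.5) -> List[int]:
    """Return the start index of each segment in a globally best segmentation.

    sim[a][b] is an assumed similarity between sentences a and b.
    Each segment [i, j) scores (sum of sim inside it) / (j - i) - penalty;
    this scoring rule is a hypothetical choice, not taken from the paper.
    """
    n = len(sim)
    # 2D prefix sums: prefix[i][j] = sum of sim[a][b] for a < i and b < j.
    prefix = [[0.0] * (n + 1) for _ in range(n + 1)]
    for a in range(n):
        for b in range(n):
            prefix[a + 1][b + 1] = (
                sim[a][b] + prefix[a][b + 1] + prefix[a + 1][b] - prefix[a][b]
            )

    def block_score(i: int, j: int) -> float:
        # Sum of all pairwise similarities among sentences i .. j-1.
        total = prefix[j][j] - prefix[i][j] - prefix[j][i] + prefix[i][i]
        return total / (j - i) - penalty

    # best[j] = best total score for segmenting the first j sentences.
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + block_score(i, j)
            if candidate > best[j]:
                best[j] = candidate
                back[j] = i

    # Walk the back-pointers from the end to recover segment starts.
    starts: List[int] = []
    j = n
    while j > 0:
        starts.append(back[j])
        j = back[j]
    return starts[::-1]
```

Because every possible last segment is considered for every prefix, the result is optimal for the chosen scoring rule rather than a greedy approximation. The fixed per-segment penalty keeps the search from splitting the text into single sentences.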
Cite
Sun, Q., Li, R., Luo, D., & Wu, X. (2008). Text segmentation with LDA-based Fisher kernel. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 269–272). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1557690.1557768