Abstract
We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be elegantly represented using a distribution over permutations called the generalized Mallows model. Our structure-aware approach substantially outperforms alternative approaches for cross-document comparison and single-document segmentation. © 2009 Association for Computational Linguistics.
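To make the "distribution over permutations" concrete, here is a minimal sketch of sampling from a generalized Mallows model in its standard inversion-count formulation. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 0-indexed items, the identity permutation as the canonical ordering, and the dispersion values in `rho` are all hypothetical choices made for this example. Each count `v[j]` records how many larger items precede item `j`, and is drawn independently with probability proportional to `exp(-rho[j] * v[j])`, so larger dispersion values concentrate mass on orderings close to the canonical one.

```python
import math
import random


def sample_inversion_counts(n, rho, rng):
    """Draw v[j] in {0, ..., n-1-j} with P(v[j] = r) proportional to exp(-rho[j] * r).

    `rho` is a list of n-1 dispersion parameters (illustrative values, not from the paper).
    """
    counts = []
    for j in range(n - 1):
        support = list(range(n - j))  # item j has n-1-j larger items
        weights = [math.exp(-rho[j] * r) for r in support]
        counts.append(rng.choices(support, weights=weights, k=1)[0])
    return counts


def inversion_counts_to_permutation(counts):
    """Rebuild the ordering: insert items from largest to smallest.

    When item j is inserted, the list holds only larger items, so placing it
    at index counts[j] puts exactly counts[j] larger items before it.
    """
    n = len(counts) + 1
    perm = [n - 1]
    for j in range(n - 2, -1, -1):
        perm.insert(counts[j], j)
    return perm


def permutation_to_inversion_counts(perm):
    """Inverse mapping: count larger items that appear before each item."""
    position = {item: i for i, item in enumerate(perm)}
    n = len(perm)
    return [
        sum(1 for k in range(j + 1, n) if position[k] < position[j])
        for j in range(n - 1)
    ]


def sample_permutation(n, rho, seed=None):
    """Sample one ordering of n items (0..n-1) around the identity ordering."""
    rng = random.Random(seed)
    return inversion_counts_to_permutation(sample_inversion_counts(n, rho, rng))
```

Calling `sample_permutation(5, [2.0] * 4, seed=0)` returns one ordering of five topics. Because the counts are independent, the all-zero count vector (the canonical ordering) is the single most probable outcome whenever every dispersion value is positive.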
Cite
Chen, H., Branavan, S. R. K., Barzilay, R., & Karger, D. R. (2009). Global models of document structure using latent permutations. In NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 371–379). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1620754.1620808