Abstract
The goal of this thesis is to develop techniques for comparative summarisation of multimodal document collections. Comparative summarisation is extractive summarisation in comparative settings, where documents form two or more groups, e.g. articles on the same topic but from different sources. It involves selecting not only representative and diverse samples within groups, but also samples that highlight commonalities and differences between the groups. We posit that comparative summarisation is a fruitful problem for diverse use cases, such as comparing content over time, across authors, or between distinct viewpoints. We formulate comparative summarisation by reducing it to a binary classification problem and define objectives that incorporate representativeness, diversity and comparativeness. We design new automatic and crowd-sourced evaluation protocols for summarisation that scale much better than evaluations requiring manually created ground-truth summaries. We show the efficacy of the approach on a newly curated dataset of controversial news topics. We plan to develop new collection comparison methods for multimodal document collections.
Bista, U. (2019). Comparative summarisation of rich media collections. In WSDM 2019 - Proceedings of the 12th ACM International Conference on Web Search and Data Mining (pp. 812–813). Association for Computing Machinery, Inc. https://doi.org/10.1145/3289600.3291603