Toward comprehensive event collections

Abstract

Web archives, such as the Internet Archive, preserve an unprecedented abundance of materials regarding major events and transformations in our society. In this paper, we present an approach for building event-centric sub-collections from such large archives, which include not only the core documents related to the event itself but, even more importantly, documents describing related aspects (e.g., premises and consequences). This is achieved by identifying relevant concepts and entities from a knowledge base and then detecting their mentions in documents, which are interpreted as indicators of relevance. We extensively evaluate our system on two diachronic corpora, the New York Times Corpus and the US Congressional Record; additionally, we test its performance on the TREC KBA Stream Corpus and on the TREC-CAR dataset, two publicly available large-scale web collections.

Cite

Nanni, F., Ponzetto, S. P., & Dietz, L. (2020). Toward comprehensive event collections. International Journal on Digital Libraries, 21(2), 215–229. https://doi.org/10.1007/s00799-018-0246-x
