Mining the twentieth century’s history from the Time Magazine corpus


Abstract

In this paper we report on an explorative study of the history of the twentieth century from a lexical point of view. As data, we use a diachronic collection of 270,000+ English-language articles harvested from the electronic archive of the well-known Time Magazine (1923–2006). We attempt to automatically identify significant shifts in the vocabulary used in this corpus using efficient, yet unsupervised computational methods, such as Parsimonious Language Models. We offer a qualitative interpretation of the outcome of our experiments in the light of momentous events in the twentieth century, such as the Second World War or the rise of the Internet. This paper follows up on a recent string of frequentist approaches to studying cultural history (‘Culturomics’), in which the evolution of human culture is studied from a quantitative perspective, on the basis of lexical statistics extracted from large, textual data sets.

Cite

Kestemont, M., Karsdorp, F., & Düring, M. (2014). Mining the twentieth century’s history from the Time Magazine corpus. In Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2014 at the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014 (pp. 62–70). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-0609
