We present a text mining environment that supports entity-centric mining of terascale historical newspaper collections. Information about entities and their relations to each other is often crucial for historical research. However, most text mining tools provide only very basic support for dealing with entities, typically limited to entity tagging. Historians, on the other hand, are typically interested in the relations between entities and the contexts in which they are mentioned. In this paper, we focus on person entities. We provide an overview of the tool and describe how person-centric mining can be integrated into a general-purpose text mining environment. We also discuss our approach to automatically extracting person networks from newspaper archives. It includes a novel method for person name disambiguation that is particularly suited to the newspaper domain and obtains state-of-the-art disambiguation results.
CITATION STYLE
Coll Ardanuy, M., Knauth, J., Beliankou, A., van den Bos, M., & Sporleder, C. (2016). Person-centric mining of historical newspaper collections. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9819 LNCS, pp. 320–331). Springer Verlag. https://doi.org/10.1007/978-3-319-43997-6_25