Working with large and unstructured collections of historical documents is a challenging task for historians. Despite the recent growth in the volume of digitized historical data, available collections are rarely accompanied by computational tools that significantly facilitate this task. We address this shortage by proposing a visualization method for document collections that focuses on graphical representation of similarities between documents. The strength of the similarities is measured according to the overlap of historically significant information such as named entities, or the overlap of general vocabulary. Similarity strengths are then encoded in the edges of a graph. The graph provides visual structure, revealing interpretable clusters and links between documents that are otherwise difficult to establish. We implement the idea of similarity graphs within an information retrieval system supported by an interactive graphical user interface. The system allows querying the database, visualizing the results, and browsing the collection in an effective and intuitive way. Our approach can be easily adapted and extended to collections of documents in other domains.
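The similarity-graph idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how overlap is scored, so Jaccard overlap of named-entity sets is assumed here, and the function names, the `min_weight` threshold, and the sample documents are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two item sets: |a & b| / |a | b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_graph(doc_entities, min_weight=0.0):
    """Return weighted edges {(doc_u, doc_v): weight} for document pairs
    whose entity overlap exceeds min_weight (an assumed cutoff)."""
    edges = {}
    for u, v in combinations(sorted(doc_entities), 2):
        weight = jaccard(doc_entities[u], doc_entities[v])
        if weight > min_weight:
            edges[(u, v)] = weight
    return edges


# Hypothetical documents, each reduced to its set of named entities.
docs = {
    "doc_a": {"Napoleon", "Paris", "Waterloo"},
    "doc_b": {"Napoleon", "Waterloo", "Wellington"},
    "doc_c": {"Vienna", "Metternich"},
}

graph = build_similarity_graph(docs)
for (u, v), weight in graph.items():
    print(f"{u} -- {v}: {weight:.2f}")
```

Here only `doc_a` and `doc_b` share entities, so they are the only pair joined by an edge. The same function could take general-vocabulary sets instead of entity sets, which corresponds to the second similarity measure the abstract mentions.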
Berzak, Y., Richter, M., Ehrler, C., & Shore, T. (2011). Information Retrieval and Visualization for the Historical Domain. In Language Technology for Cultural Heritage (pp. 197–212). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-20227-8_11