Using layout data for the analysis of scientific literature

Abstract

It is said that the world's knowledge is on the Internet, while scientific knowledge resides in books, journals and conference proceedings. Yet both repositories are too large to skim through manually, so we need clever algorithms to cope with the huge amount of information. To filter, sort and ultimately mine the information available, it is vital to use every source of information we have. A common technique is to mine the text of publications, but publications are more complex than the text they contain. The position of words gives us clues about their meaning. Images either supplement the text or offer evidence for a proposition. Tables cannot be understood without first deciphering their rows and columns. To deal with this additional information, classic text mining techniques have to be coupled with spatial data and image data. In this chapter, we give some background on the various techniques, explain the necessary pre-processing steps and present two case studies, one from image mining and one from table identification. © 2009 Springer-Verlag Berlin Heidelberg.

Cite

Mathiak, B., Kupfer, A., & Eckstein, S. (2009). Using layout data for the analysis of scientific literature. Studies in Computational Intelligence, 165, 3–22. https://doi.org/10.1007/978-3-540-88067-7_1
