TeamBeam - meta-data extraction from scientific literature

Roman Kern; Kris Jack; Maya Hristakeva; Michael Granitzer

Journal Article

D-Lib Magazine (2012) 18(7-8)

DOI: 10.1045/july2012-kern

29 citations

69 readers

Abstract

An important aspect of the work of researchers as well as librarians is to manage collections of scientific literature. Social research networks, such as Mendeley and CiteULike, provide services that support this task. Meta-data plays an important role in providing services to retrieve and organise the articles. In such settings, meta-data is rarely explicitly provided, leading to the need for automatically extracting this valuable information. The TeamBeam algorithm analyses a scientific article and extracts structured meta-data, such as the title, journal name and abstract, as well as information about the article's authors (e.g. names, e-mail addresses, affiliations). The input of the algorithm is a set of blocks generated from the article text. A classification algorithm, which takes the sequence of the input into account, is then applied in two consecutive phases. In the evaluation of the algorithm, its performance is compared against two heuristics and three existing meta-data extraction systems. Three different data sets with varying characteristics are used to assess the quality of the extraction results. TeamBeam performs well under testing and compares favourably with existing approaches. © 2012 Roman Kern, Kris Jack, Maya Hristakeva, Michael Granitzer.
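The two-phase flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say what each phase does, so the split used here (phase one labels whole text blocks, phase two labels tokens inside the author block) is an assumption, and the hand-written rules stand in for whatever trained classifier TeamBeam uses. All function names, labels and the sample input are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Toy e-mail pattern; a stand-in for a trained token classifier (assumption).
EMAIL = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def classify_blocks(blocks: List[str]) -> List[Tuple[str, str]]:
    """Phase 1 (assumed): label each text block.

    The label of the previous block is used as context, which is a crude
    way of taking the sequence of the input into account.
    """
    labelled = []
    previous = "start"
    for text in blocks:
        if previous == "start":
            label = "title"
        elif previous == "title":
            label = "authors"
        elif text.lower().startswith("abstract"):
            label = "abstract"
        else:
            label = "other"
        labelled.append((label, text))
        previous = label
    return labelled


def classify_tokens(author_block: str) -> List[Tuple[str, str]]:
    """Phase 2 (assumed): label each token inside an author block."""
    return [
        ("email" if EMAIL.match(token) else "name", token)
        for token in author_block.split()
    ]


def extract(blocks: List[str]) -> Dict[str, object]:
    """Run both phases and collect the results in one record."""
    meta = {"title": None, "abstract": None, "name_tokens": [], "emails": []}
    for label, text in classify_blocks(blocks):
        if label == "title":
            meta["title"] = text
        elif label == "abstract":
            meta["abstract"] = text
        elif label == "authors":
            for kind, token in classify_tokens(text):
                key = "emails" if kind == "email" else "name_tokens"
                meta[key].append(token)
    return meta


if __name__ == "__main__":
    # Made-up sample input; the e-mail address is a placeholder.
    sample = [
        "An Example Article Title",
        "Jane Doe jane.doe@example.org",
        "Abstract This is a placeholder abstract.",
        "1 Introduction",
    ]
    print(extract(sample))
```

Running the second phase only on blocks that the first phase labelled as author blocks keeps the token-level step small, which is one plausible reason for running the phases consecutively.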

Citation (APA)

Kern, R., Jack, K., Hristakeva, M., & Granitzer, M. (2012). TeamBeam - meta-data extraction from scientific literature. D-Lib Magazine, 18(7–8). https://doi.org/10.1045/july2012-kern
