A framework for information extraction from tables in biomedical literature

Nikola Milosevic; Cassie Gregson; Robert Hernandez; Goran Nenadic

Journal Article, Open Access

International Journal on Document Analysis and Recognition (2019) 22(1) 55-78

DOI: 10.1007/s10032-019-00317-0

Abstract

The scientific literature is growing exponentially, and professionals can no longer cope with the current volume of publications. Text mining has provided methods to retrieve and extract information from text; however, most of these approaches ignore tables and figures. Research on mining table data still lacks an integrated approach that considers all the complexities and challenges of a table. Our research examines methods for extracting numerical (number of patients, age, gender distribution) and textual (adverse reactions) information from tables in the clinical literature. We present a requirement analysis template and an integral methodology for information extraction from tables in the clinical domain that contains 7 steps: (1) table detection, (2) functional processing, (3) structural processing, (4) semantic tagging, (5) pragmatic processing, (6) cell selection and (7) syntactic processing and extraction. Our approach achieved F-measures ranging between 82% and 92%, depending on the variable, task and its complexity.
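For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F-measure is computed from extraction counts. It assumes the balanced form (beta = 1, the harmonic mean of precision and recall); the abstract does not state which variant was used. The function name `f_measure` and the counts are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from true positive, false positive,
    and false negative counts. beta=1.0 is an assumption (balanced F1)."""
    if tp == 0:
        # No correct extractions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)  # share of extracted values that are correct
    recall = tp / (tp + fn)     # share of gold-standard values that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for one extracted variable (illustration only).
score = f_measure(tp=90, fp=10, fn=20)
print(f"{score:.3f}")  # prints 0.857
```

With these illustrative counts, precision is 0.90 and recall is about 0.82, so the balanced F-measure falls between them at roughly 0.86.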

Cite

Milosevic, N., Gregson, C., Hernandez, R., & Nenadic, G. (2019). A framework for information extraction from tables in biomedical literature. International Journal on Document Analysis and Recognition, 22(1), 55–78. https://doi.org/10.1007/s10032-019-00317-0
