Imitating manual curation of text-mined facts in biomedicine

Abstract

Text-mining algorithms make mistakes when extracting facts from natural-language texts. In biomedical applications, which rely on the use of text-mined data, it is critical to assess the quality of individual facts (the probability that each fact was correctly extracted) in order to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine. © 2006 Rodriguez-Esteban et al.
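
The ROC score quoted in the abstract can be read as the probability that a classifier assigns a higher score to a randomly chosen correct fact than to a randomly chosen incorrect one. The following is a minimal sketch of that calculation, not the authors' implementation; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: 1 if a human evaluator judged the extracted fact correct, else 0.
    scores: the classifier's predicted probability that the fact is correct.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect fact")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correct fact ranked above incorrect fact
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not from the paper).
human_labels = [1, 1, 0, 1, 0, 0]
predicted = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(human_labels, predicted)
print(auc)
```

A score of 1.0 would mean every correct fact is ranked above every incorrect one, and 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of facts, so for a set on the scale of 100,000 evaluations a rank-based or library implementation would be the more practical choice.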

Cite

Rodriguez-Esteban, R., Iossifov, I., & Rzhetsky, A. (2006). Imitating manual curation of text-mined facts in biomedicine. PLoS Computational Biology, 2(9), 1031–1044. https://doi.org/10.1371/journal.pcbi.0020118
