The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning approaches require a large set of labeled training data and can be susceptible to noise. Methods based on predefined databases are limited by the quality of their curation and are unable to capture new concepts or mistakes in the input. We address these challenges by adapting and improving a method designed for a seemingly unrelated problem: finding alignments between genomic sequences. This paper presents a novel method for extracting causal relations from text by aligning the part-of-speech representations of an input set with that of known causal sentences. Our experiments show that when applied to the task of finding causal sentences in biological literature, our method improves on the accuracy of other methods in a computationally efficient manner.
CITATION STYLE
Wood, J., Matiasz, N., Silva, A., Hsu, W., Abyzov, A., & Wang, W. (2022). OpBerg: Discovering Causal Sentences Using Optimal Alignments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13428 LNCS, pp. 17–30). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-12670-3_2
Mendeley helps you to discover research relevant for your work.