BioELECTRA: Pretrained Biomedical Text Encoder using Discriminators


Abstract

Recent advancements in pretraining strategies in NLP have shown significant improvements in model performance on various text mining tasks. In this paper, we introduce BioELECTRA, a biomedical domain-specific language encoder model that adapts ELECTRA (Clark et al., 2020) to the biomedical domain. BioELECTRA outperforms previous models and achieves state-of-the-art (SOTA) results on all 13 datasets in the BLURB benchmark and on all 4 clinical datasets from the BLUE benchmark, across 7 NLP tasks. BioELECTRA, pretrained on PubMed abstracts and PMC full-text articles, also performs well on clinical datasets. BioELECTRA achieves a new SOTA of 86.34% (a 1.39% accuracy improvement) on MedNLI and 64% (a 2.98% accuracy improvement) on the PubMedQA dataset.
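The ELECTRA recipe that BioELECTRA adapts trains a discriminator to detect replaced tokens rather than reconstruct masked ones. As a minimal sketch of that replaced-token-detection objective at inference time, the snippet below loads a BioELECTRA discriminator through the Hugging Face transformers library and scores each token of a deliberately corrupted sentence. The model identifier is an assumption, not taken from the paper; substitute whichever published BioELECTRA discriminator checkpoint you use.

import torch
from transformers import AutoTokenizer, ElectraForPreTraining

# Assumed checkpoint id; replace with the BioELECTRA discriminator you use.
model_id = "kamalkraj/bioelectra-base-discriminator-pubmed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ElectraForPreTraining.from_pretrained(model_id)

# Sentence with one implausible token ("apple") standing in for an original.
text = "The patient was treated with apple for hypertension."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # One logit per token: higher means "likely a replaced token".
    logits = model(**inputs).logits

# Flag tokens the discriminator judges to be replacements.
preds = (torch.sigmoid(logits) > 0.5).squeeze(0).tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for tok, flag in zip(tokens, preds):
    print(f"{tok:>12}  {'REPLACED' if flag else 'original'}")

For fine-tuning on BLURB- or BLUE-style tasks, the same checkpoint can instead be loaded with a task head (e.g. ElectraForSequenceClassification) and trained on the downstream dataset.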

Citation (APA)

Kanakarajan, K. R., Kundumani, B., & Sankarasubbu, M. (2021). BioELECTRA: Pretrained Biomedical text Encoder using Discriminators. In Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021 (pp. 143–154). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.bionlp-1.16
