The Natural Language Processing (NLP) community has used crowd-sourcing to create benchmark datasets such as General Language Understanding Evaluation (GLUE) for training modern Language Models (LMs) such as BERT. GLUE tasks report reliability using an inter-annotator agreement metric, Cohen's Kappa (κ). However, the reliability of the LMs themselves has often been overlooked. To counter this problem, we explore a knowledge-guided LM ensembling approach that leverages reinforcement learning to integrate knowledge from ConceptNet and Wikipedia as knowledge graph embeddings. This approach mimics human annotators resorting to external knowledge to compensate for information deficits in the datasets. Across nine GLUE datasets, our research shows that ensembling strengthens reliability and accuracy scores, outperforming the state of the art.
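For readers unfamiliar with the reliability metric named above, Cohen's Kappa compares the observed agreement between two annotators, p_o, with the agreement expected by chance, p_e, as κ = (p_o − p_e) / (1 − p_e). The following is a minimal illustrative sketch of that standard formula, not the authors' evaluation code; the function name `cohens_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items on which both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, multiply the two annotators'
    # marginal frequencies, then sum over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_e == 1.0:
        # Degenerate case: both annotators used one identical label throughout.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy annotations (not taken from GLUE).
annotator_1 = ["entail", "entail", "contra", "contra", "entail", "contra"]
annotator_2 = ["entail", "contra", "contra", "contra", "entail", "entail"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the annotators agree on four of six items (p_o ≈ 0.667) while chance agreement is 0.5, so κ is about 0.333. The same function could in principle be applied to a model's predictions against reference labels, which is one way to read "reliability" for an LM, though the exact protocol used in the paper is not stated in this abstract.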
Tyagi, N., Sarkar, S., & Gaur, M. (2023). Leveraging Knowledge and Reinforcement Learning for Enhanced Reliability of Language Models. In International Conference on Information and Knowledge Management, Proceedings (pp. 4320–4324). Association for Computing Machinery. https://doi.org/10.1145/3583780.3615273