We describe a weakly supervised method for training deep learning models for the task of ad-hoc document retrieval. Our method is based on generative and discriminative models that are trained using weak supervision derived solely from the documents in the corpus. We present an end-to-end retrieval system that starts with traditional information retrieval methods, followed by two deep learning re-rankers. We evaluate our method on three datasets: a COVID-19 related scientific literature dataset and two news datasets. We show that our method outperforms state-of-the-art methods without requiring the expensive process of manually labeling data.
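The retrieve-then-rerank cascade described above can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `term_overlap_score` stands in for a traditional lexical retriever, and `length_normalized_score` and `phrase_match_score` are toy placeholders where the two trained neural re-rankers would plug in. The sketch shows only the control flow of a cascade, not the authors' models or their weak-supervision training.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# A scorer maps (query, document text) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def term_overlap_score(query: str, doc: str) -> float:
    """Toy lexical first stage: count occurrences of query terms in the document."""
    doc_terms = Counter(doc.lower().split())
    return float(sum(doc_terms[term] for term in query.lower().split()))


def length_normalized_score(query: str, doc: str) -> float:
    """Placeholder for a first neural re-ranker (assumption, not the paper's model)."""
    tokens = doc.split()
    return term_overlap_score(query, doc) / len(tokens) if tokens else 0.0


def phrase_match_score(query: str, doc: str) -> float:
    """Placeholder for a second neural re-ranker (assumption, not the paper's model)."""
    return 1.0 if query.lower() in doc.lower() else 0.0


def retrieve_then_rerank(
    query: str,
    corpus: Dict[str, str],
    first_stage: Scorer,
    rerankers: List[Tuple[Scorer, int]],
    first_k: int,
) -> List[str]:
    """Return document ids after a first-stage retrieval and successive re-rankers.

    Each re-ranker is paired with the number of candidates it keeps.
    """
    # Stage 1: score the whole corpus cheaply and keep the top first_k candidates.
    ranked = sorted(
        corpus, key=lambda doc_id: first_stage(query, corpus[doc_id]), reverse=True
    )[:first_k]
    # Later stages: re-score only the surviving candidates with costlier scorers.
    for scorer, keep in rerankers:
        ranked = sorted(
            ranked, key=lambda doc_id: scorer(query, corpus[doc_id]), reverse=True
        )[:keep]
    return ranked


if __name__ == "__main__":
    corpus = {
        "d1": "covid vaccine trial results",
        "d2": "stock market news today",
        "d3": "vaccine vaccine vaccine side effects and unrelated long text",
    }
    top = retrieve_then_rerank(
        "covid vaccine",
        corpus,
        first_stage=term_overlap_score,
        rerankers=[(length_normalized_score, 2), (phrase_match_score, 1)],
        first_k=2,
    )
    print(top)
```

The design point is that the cheap scorer runs over the full corpus while each later, more expensive scorer sees only a short candidate list, which is what makes neural re-ranking affordable at query time.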
Mass, Y., & Roitman, H. (2020). Ad-hoc document retrieval using weak-supervision with BERT and GPT2. In EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 4191–4197). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.emnlp-main.343