Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer

14 Citations · 53 Mendeley Readers

Abstract

Systems for knowledge-intensive tasks such as open-domain question answering (QA) usually consist of two stages: efficient retrieval of relevant documents from a large corpus and detailed reading of the selected documents to generate answers. Retrievers and readers are usually modeled separately, which necessitates a cumbersome implementation and is hard to train and adapt in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs Retrieval as Attention (ReAtt), and end-to-end training solely based on supervision from the end QA task. We demonstrate for the first time that a single model trained end-to-end can achieve both competitive retrieval and QA performance, matching or slightly outperforming state-of-the-art separately trained retrievers and readers. Moreover, end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings, making our model a simple and adaptable solution for knowledge-intensive tasks. Code and models are available at https://github.com/jzbjyb/ReAtt.
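
The abstract's core claim is architectural: retrieval scores come directly from attention inside a single Transformer, and the only training signal is the downstream QA loss. The PyTorch sketch below illustrates that idea under stated assumptions; the module, the aggregation of token-level attention into a per-document retrieval score (max over document tokens, mean over question tokens), and the toy answer head are all illustrative guesses, not the paper's actual ReAtt architecture (see the linked repository for that).

    # Minimal, hypothetical sketch of "retrieval as attention": one shared
    # Transformer encodes the question and candidate documents, and
    # question-to-document attention scores double as retrieval scores that
    # are trained end-to-end from the QA loss alone. All names, sizes, and
    # the scoring aggregation below are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyReAtt(nn.Module):
        def __init__(self, vocab_size=1000, d_model=64, nhead=4):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            # Projections for the attention head whose scores act as
            # retrieval scores.
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            self.answer_head = nn.Linear(d_model, vocab_size)  # toy QA head

        def forward(self, question_ids, doc_ids):
            # question_ids: (Lq,), doc_ids: (num_docs, Ld)
            q = self.encoder(self.embed(question_ids)[None])[0]    # (Lq, d)
            d = self.encoder(self.embed(doc_ids))                  # (N, Ld, d)
            # Scaled attention logits between question and document tokens.
            att = torch.einsum('qe,nle->nql', self.q_proj(q), self.k_proj(d))
            att = att / q.shape[-1] ** 0.5                         # (N, Lq, Ld)
            # Per-document retrieval score: max over document tokens, then
            # mean over question tokens (an assumed aggregation).
            doc_scores = att.max(dim=-1).values.mean(dim=-1)       # (N,)
            # Toy reading step: pool documents with soft retrieval weights so
            # gradients from the QA loss reach the retrieval scores.
            weights = F.softmax(doc_scores, dim=0)                 # (N,)
            pooled = (weights[:, None] * d.mean(dim=1)).sum(dim=0) # (d,)
            return doc_scores, self.answer_head(pooled)

    # End-to-end training signal: only the QA loss, which backpropagates
    # through the soft document weights and thus trains retrieval "for free".
    model = ToyReAtt()
    question = torch.randint(0, 1000, (8,))
    docs = torch.randint(0, 1000, (5, 32))
    scores, answer_logits = model(question, docs)
    loss = F.cross_entropy(answer_logits[None], torch.tensor([42]))
    loss.backward()

Because the answer distribution is a softmax-weighted mixture over documents, the QA loss alone updates the attention projections that score documents; this is the end-to-end property the abstract emphasizes, with no retrieval-specific supervision needed.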

Citation (APA)

Jiang, Z., Gao, L., Araki, J., Ding, H., Wang, Z., Callan, J., & Neubig, G. (2022). Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 2336–2349). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.149
