Weakly Supervised Pre-Training for Multi-Hop Retriever

Abstract

In multi-hop QA, answering a complex question entails iterative document retrieval to find the entities missing from the question. The main steps of this process are sub-question detection, document retrieval for the sub-question, and generation of a new query for the final document retrieval. However, building a dataset that pairs complex questions with their sub-questions and corresponding documents requires costly human annotation. To address this issue, we propose a new method for weakly supervised multi-hop retriever pre-training that requires no human annotation effort. Our method includes 1) a pre-training task for generating vector representations of complex questions, 2) a scalable data generation method that produces the nested structure of question and sub-question as weak supervision for pre-training, and 3) a pre-training model structure based on dense encoders. We compare the performance of our pre-trained retriever with several state-of-the-art models on end-to-end multi-hop QA as well as on document retrieval. The experimental results show that our pre-trained retriever is effective and remains robust under limited data and computational resources.
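
The abstract outlines an iterative retrieval loop: retrieve a document for the sub-question, then build a new query for the final retrieval. As a rough illustration of that loop only, the following is a minimal Python sketch of two-hop dense retrieval with inner-product scoring. The encode function, the toy corpus, and the "[SEP]"-style query concatenation are hypothetical stand-ins for illustration, not the paper's trained models, data, or query generator.

    import zlib
    import numpy as np

    DIM = 64

    def encode(text: str) -> np.ndarray:
        # Hypothetical stand-in for a trained dense encoder: a deterministic
        # pseudo-random unit vector seeded by the text's CRC32 checksum.
        rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
        vec = rng.standard_normal(DIM)
        return vec / np.linalg.norm(vec)

    def retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray) -> int:
        # Maximum-inner-product search over the document vectors.
        return int(np.argmax(doc_vecs @ query_vec))

    # Toy corpus; a real retriever would index Wikipedia-scale text.
    docs = [
        "Document describing the bridge entity of the question.",
        "Document containing the final answer.",
        "An unrelated document.",
    ]
    doc_vecs = np.stack([encode(d) for d in docs])

    question = "A complex question that needs two retrieval hops."

    # Hop 1: retrieve a document for the (implicit) sub-question.
    first = retrieve(encode(question), doc_vecs)

    # Hop 2: build a new query from the question plus the first-hop
    # document, then retrieve again for the final document.
    second = retrieve(encode(question + " [SEP] " + docs[first]), doc_vecs)

    print("hop 1:", docs[first])
    print("hop 2:", docs[second])

In the paper itself, the second-hop query is produced by a learned query-generation step rather than simple concatenation; the concatenation above merely stands in for that step.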

Cite

APA

Seonwoo, Y., Lee, S. W., Kim, J. H., Ha, J. W., & Oh, A. (2021). Weakly Supervised Pre-Training for Multi-Hop Retriever. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 694–704). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.62
