Distant Supervision for Multi-Stage Fine-Tuning in Retrieval-Based Question Answering

Abstract

We tackle the problem of question answering directly on a large document collection, combining simple "bag of words" passage retrieval with a BERT-based reader for extracting answer spans. In the context of this architecture, we present a data augmentation technique using distant supervision to automatically annotate paragraphs as either positive or negative examples to supplement existing training data, which are then used together to fine-tune BERT. We explore a number of details that are critical to achieving high accuracy in this setup: the proper sequencing of different datasets during fine-tuning, the balance between "difficult" vs. "easy" examples, and different approaches to gathering negative examples. Experimental results show that, with the appropriate settings, we can achieve large gains in effectiveness on two English and two Chinese QA datasets. We are able to achieve results at or near the state of the art without any modeling advances, which once again affirms the cliché "there's no data like more data".
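The core distant-supervision step described in the abstract, annotating retrieved paragraphs as positive or negative examples based on whether they contain the gold answer, can be sketched as follows. This is a minimal illustrative sketch under the common span-matching assumption; the function name and matching heuristic are assumptions, not the authors' actual code.

```python
def label_paragraphs(answer: str, paragraphs: list[str]) -> tuple[list[str], list[str]]:
    """Distantly supervise paragraph labels: a paragraph is a positive
    example if it contains the gold answer string, else a negative.
    (Sketch: real systems may add normalization or stricter matching.)"""
    positives, negatives = [], []
    for p in paragraphs:
        if answer.lower() in p.lower():
            positives.append(p)
        else:
            negatives.append(p)
    return positives, negatives
```

The positives and negatives produced this way would then supplement the original training data when fine-tuning the BERT reader, with the paper's reported gains hinging on how these examples are sequenced and balanced.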

APA

Xie, Y., Yang, W., Tan, L., Xiong, K., Yuan, N. J., Huai, B., … Lin, J. (2020). Distant Supervision for Multi-Stage Fine-Tuning in Retrieval-Based Question Answering. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 2934–2940). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380060
