Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers

Abstract

Text-to-SQL parsers typically struggle with databases unseen during training. Adapting parsers to new databases is challenging because natural language queries are lacking for the new schemas. We present REFILL, a framework for synthesizing high-quality and textually diverse parallel datasets for adapting a Text-to-SQL parser to a target schema. REFILL learns to retrieve and edit text queries from the existing schemas and transfer them to the target schema. We show that retrieving diverse existing text, masking its schema-specific tokens, and refilling with tokens relevant to the target schema yields significantly more diverse text queries than standard SQL-to-Text generation methods achieve. Through experiments spanning multiple databases, we demonstrate that fine-tuning parsers on datasets synthesized using REFILL consistently outperforms prior data-augmentation methods.
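
The mask-and-refill idea described above can be illustrated with a toy sketch. This is not the REFILL implementation: the abstract says the editing is learned, whereas the sketch below uses hand-written term lists and plain string substitution. The function names (`mask_schema_tokens`, `refill`), the `[MASK]` token, and the example sentences and schema terms are all assumptions made for illustration.

```python
import re

MASK = "[MASK]"  # hypothetical mask token, chosen for this sketch


def mask_schema_tokens(text, schema_terms, mask=MASK):
    """Replace each schema-specific term in `text` with a mask token."""
    # Longest terms first, so multi-word terms are masked before their parts.
    for term in sorted(schema_terms, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(term)}\b", mask, text, flags=re.IGNORECASE)
    return text


def refill(masked_text, fillers, mask=MASK):
    """Fill masks left to right with terms from the target schema."""
    # A learned model would choose the fillers; here they are supplied by hand.
    for filler in fillers:
        masked_text = masked_text.replace(mask, filler, 1)
    return masked_text


# Text query from an existing (source) schema, with its schema-specific terms.
source_text = "Show the names of all singers older than 30"
source_terms = ["names", "singers"]

masked = mask_schema_tokens(source_text, source_terms)
transferred = refill(masked, ["titles", "movies"])

print(masked)       # Show the [MASK] of all [MASK] older than 30
print(transferred)  # Show the titles of all movies older than 30
```

The sentence template ("Show the ... of all ... older than 30") survives the transfer while the schema vocabulary changes, which is the intuition behind reusing diverse existing text rather than generating it from SQL alone.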

Cite

Awasthi, A., Sathe, A., & Sarawagi, S. (2022). Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 11548–11562). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.794
