Abstract
On the WikiSQL benchmark, most methods tackle the challenge of text-to-SQL with predefined sketch slots and build sophisticated sub-tasks to fill these slots. Though achieving promising results, these methods suffer from over-complex model structures. In this paper, we present a simple yet effective approach that enables an auto-regressive sequence-to-sequence model to perform robust text-to-SQL generation. Instead of formulating text-to-SQL as slot-filling, we propose to train a sequence-to-sequence model with Schema-aware Denoising (SeaD), which consists of two denoising objectives that train the model to either recover the input or predict the output under two novel noise types, erosion and shuffle. These model-agnostic denoising objectives act as auxiliary tasks for structural data modeling during sequence-to-sequence generation. In addition, we propose a clause-sensitive execution-guided (EG) decoding strategy to overcome the limitations of EG decoding for generative models. Experiments show that the proposed method improves the performance of the sequence-to-sequence model in both schema linking and grammar correctness, and establishes a new state-of-the-art on the WikiSQL benchmark. Our work indicates that the capacity of sequence-to-sequence models for text-to-SQL may have been under-estimated and could be enhanced by specialized denoising tasks.
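The abstract does not spell out how the erosion and shuffle noises are applied, but the general idea of schema-aware noising can be illustrated with a minimal sketch. The helpers below are hypothetical (the paper's actual noising procedure, serialization format, and hyperparameters may differ): `shuffle_noise` permutes the schema columns so the model must recover the original serialization, and `erosion_noise` randomly drops columns to simulate an incomplete schema.

```python
import random


def shuffle_noise(columns, seed=None):
    """Illustrative 'shuffle' noise: permute schema column order.

    A denoising objective would then train the model to recover the
    original column ordering (or generate correct SQL despite it).
    """
    rng = random.Random(seed)
    noised = list(columns)
    rng.shuffle(noised)
    return noised


def erosion_noise(columns, drop_prob=0.3, seed=None):
    """Illustrative 'erosion' noise: randomly drop schema columns,
    forcing the model to cope with corrupted schema input.
    """
    rng = random.Random(seed)
    kept = [c for c in columns if rng.random() > drop_prob]
    # Never erode the schema to nothing; fall back to the full schema.
    return kept or list(columns)


# Example: noising a toy WikiSQL-style table schema.
schema = ["player", "team", "points", "season"]
print(shuffle_noise(schema, seed=0))
print(erosion_noise(schema, drop_prob=0.5, seed=0))
```

In a denoising setup along these lines, the noised schema serialization would form the encoder input while the clean schema (or the gold SQL) serves as the target, making schema modeling an auxiliary task alongside SQL generation.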
Citation
Xu, K., Wang, Y., Wang, Y., Wang, Z., Wen, Z., & Dong, Y. (2022). SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising. In Findings of the Association for Computational Linguistics: NAACL 2022 - Findings (pp. 1845–1853). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-naacl.141