Python Code Generation by Asking Clarification Questions

4 Citations · 17 Readers (Mendeley users who have this article in their library)

Abstract

Code generation from text requires understanding the user's intent from a natural language description and generating an executable code snippet that satisfies this intent. While recent pretrained language models demonstrate remarkable performance on this task, they fail when the given natural language description is under-specified. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions. Therefore, we collect and introduce a new dataset, CodeClarQA, containing pairs of natural language descriptions and code, along with synthetically created clarification questions and answers. Our empirical evaluation of pretrained language models on code generation shows that clarifications result in more precisely generated code, as reflected in substantial improvements across all evaluation metrics. Alongside this, our task and dataset introduce new challenges to the community, including when and what clarification questions should be asked. Our code and dataset are available on GitHub.
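As a rough illustration of the setup described in the abstract, clarification question–answer pairs can be appended to an under-specified description before prompting a pretrained code generation model. The sketch below is a hypothetical example only: the record fields, the example description and answers, and the choice of model checkpoint (Salesforce/codet5-base via Hugging Face Transformers) are assumptions for illustration, not the CodeClarQA schema or the authors' actual pipeline.

```python
# Hypothetical sketch: augment an under-specified description with
# clarification Q&A pairs, then prompt a pretrained seq2seq code model.
# All names and example data below are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


def build_prompt(description, clarification_qas):
    """Concatenate the description with clarification question-answer pairs."""
    lines = [description]
    for question, answer in clarification_qas:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    return "\n".join(lines)


# Example record (hypothetical, for illustration only).
description = "Sort the records and write them to a file."
clarification_qas = [
    ("Which key should the records be sorted by?", "The 'date' field, ascending."),
    ("What output format is expected?", "One JSON object per line."),
]

# Any pretrained encoder-decoder code model could stand in here.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-base")

inputs = tokenizer(build_prompt(description, clarification_qas), return_tensors="pt")
outputs = model.generate(**inputs, max_length=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The point of the sketch is only that the clarification answers make the input less ambiguous (sort key, output format) before generation; how the questions are selected and when they should be asked is exactly the challenge the dataset is meant to study.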


Citation (APA)

Li, H. S., Mesgar, M., Martins, A. F. T., & Gurevych, I. (2023). Python Code Generation by Asking Clarification Questions. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 14287–14306). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.799

Readers over time: ’23–’25 (chart)

Readers' Seniority

PhD / Post grad / Masters / Doc: 3 (60%)
Lecturer / Post doc: 1 (20%)
Researcher: 1 (20%)

Readers' Discipline

Computer Science: 9 (90%)
Medicine and Dentistry: 1 (10%)
