Ranking LLM-Generated Loop Invariants for Program Verification

Saikat Chakraborty; Shuvendu K. Lahiri; Sarah Fakhoury; Madanlal Musuvathi; Akash Lal; Aseem Rastogi; Aditya Senthilnathan; Rahul Sharma; Nikhil Swamy

Conference Proceedings

Ranking LLM-Generated Loop Invariants for Program Verification

Findings of the Association for Computational Linguistics: EMNLP 2023 (2023) 9164-9175

DOI: 10.18653/v1/2023.findings-emnlp.614

14Citations

22Readers

Get full text

Abstract

Synthesizing inductive loop invariants is fundamental to automating program verification. In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier to establish an invariant. To address this issue, we propose a re-ranking approach for the generated results of LLMs. We have designed a ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of calls to a verifier.

Cite

CITATION STYLE

APA

Chakraborty, S., Lahiri, S. K., Fakhoury, S., Musuvathi, M., Lal, A., Rastogi, A., … Swamy, N. (2023). Ranking LLM-Generated Loop Invariants for Program Verification. In Findings of the Association for Computational Linguistics: EMNLP 2023 (pp. 9164–9175). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-emnlp.614

Ranking LLM-Generated Loop Invariants for Program Verification

Abstract

Cite

Register to see more suggestions