VeriGen: A Large Language Model for Verilog Code Generation

Shailja Thakur; Baleegh Ahmad; Hammond Pearce; Benjamin Tan; Brendan Dolan-Gavitt; Ramesh Karri; Siddharth Garg

Journal Article, Open Access

ACM Transactions on Design Automation of Electronic Systems (2024) 29(3)

DOI: 10.1145/3643681

26 Citations

48 Readers

Abstract

In this study, we explore the capability of Large Language Models (LLMs) to automate hardware design by completing partial code written in Verilog, a common language for designing and modeling digital systems. We fine-tune pre-existing LLMs on Verilog datasets compiled from GitHub and Verilog textbooks. We evaluate the functional correctness of the generated Verilog code using a specially designed test suite comprising a custom problem set and test benches. On this suite, our fine-tuned open-source CodeGen-16B model outperforms the commercial state-of-the-art GPT-3.5-turbo model by 1.1% overall. On a more diverse and complex problem set, the fine-tuned model remains competitive with GPT-3.5-turbo and excels in certain scenarios. Notably, it shows a 41% improvement over its pre-trained counterpart in generating syntactically correct Verilog code across various problem categories, highlighting the potential of smaller, in-house LLMs in hardware design automation. We release our training/evaluation scripts and LLM checkpoints as open source.
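
The abstract distinguishes two measurements: whether a generated completion is syntactically correct (it compiles) and whether it is functionally correct (it also passes a test bench). The following is a minimal sketch of how such outcomes could be tallied once each completion has been checked. The `CompletionResult` record, its field names, and the `summarize` function are hypothetical illustrations and are not taken from the paper's released scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletionResult:
    """Outcome of checking one generated Verilog completion (hypothetical record)."""
    problem_id: str
    compiles: bool          # the completion was accepted by a Verilog compiler
    passes_testbench: bool  # the simulated output matched the test bench


def summarize(results):
    """Return the fraction of completions that compile and that pass their test bench."""
    if not results:
        raise ValueError("no results to summarize")
    total = len(results)
    compiled = sum(r.compiles for r in results)
    # Assumption: a completion that does not compile is never counted as functionally correct.
    passed = sum(r.compiles and r.passes_testbench for r in results)
    return {
        "syntactic_rate": compiled / total,
        "functional_rate": passed / total,
    }


# Illustrative, made-up outcomes for two problems with two completions each.
example = [
    CompletionResult("p1", compiles=True, passes_testbench=True),
    CompletionResult("p1", compiles=True, passes_testbench=False),
    CompletionResult("p2", compiles=False, passes_testbench=False),
    CompletionResult("p2", compiles=True, passes_testbench=True),
]
rates = summarize(example)
```

Keeping the two rates separate reflects the abstract's own distinction: a model can improve markedly at producing code that compiles without a matching gain in functional correctness.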

Cite

CITATION STYLE

APA

Thakur, S., Ahmad, B., Pearce, H., Tan, B., Dolan-Gavitt, B., Karri, R., & Garg, S. (2024). VeriGen: A Large Language Model for Verilog Code Generation. ACM Transactions on Design Automation of Electronic Systems, 29(3). https://doi.org/10.1145/3643681

Readers' Seniority

Professor / Associate Prof.: 7 (33%)
PhD / Post grad / Masters / Doc: 5 (24%)
Researcher: 5 (24%)
Lecturer / Post doc: 4 (19%)

Readers' Discipline

Computer Science: 12 (63%)
Business, Management and Accounting: 5 (26%)
Arts and Humanities: 1
Engineering: 1

Article Metrics

Mentions

News Mentions: 1
