Abstract
In this paper, we present VeGA, a lightweight, decoder-only Transformer model for de novo molecular design. VeGA balances a streamlined architecture with robust generative performance, making it highly efficient and well-suited for resource-limited environments. Pretrained on ChEMBL, the model demonstrates strong performance against cutting-edge approaches, achieving high validity (96.6%) and novelty (93.6%), ranking among the top performers in the MOSES benchmark. The model’s main strength lies in target-specific fine-tuning under challenging, data-scarce conditions. In a rigorous, leakage-safe evaluation across five pharmacological targets against state-of-the-art models (S4, R4), VeGA proved to be a powerful “explorer” that consistently generated the most novel molecules while maintaining a strong balance between discovery performance and chemical realism. This capability is particularly evident in the extremely low-data scenario of mTORC1, where VeGA achieved top-tier results. As a case study, VeGA was applied to the Farnesoid X receptor (FXR), generating novel compounds with validated binding potential through molecular docking. The model is available as an open-access platform to support medicinal chemists in designing novel, target-specific chemotypes (https://github.com/piedelre93/VeGA-for-de-novo-design). Future developments will focus on incorporating conditioning strategies for multiobjective optimization and integrating experimental in vitro validation workflows.
Cite
Delre, P., & Lavecchia, A. (2025). VeGA: A Versatile Generative Architecture for Bioactive Molecules across Multiple Therapeutic Targets. Journal of Chemical Information and Modeling, 65(20), 10918–10931. https://doi.org/10.1021/acs.jcim.5c01606