Abstract
Accurate prediction of virus–host interactions is critical for understanding viral ecology and for developing applications such as phage therapy. However, the growing number of computational tools has created a complex landscape in which inconsistent benchmarks and varying usability make direct performance comparison difficult. Here, we provide a systematic review and a rigorous benchmark of 27 virus–host prediction tools. We frame the host prediction task in two ways, as link prediction and as multi-class classification, and construct two benchmark datasets to evaluate tool performance in distinct scenarios: a database-centric dataset (RefSeq-VHDB) and a metagenomic discovery dataset (MetaHiC-VHDB). Our results show that no single tool is universally optimal. Performance is highly context-dependent: tools such as CHERRY and iPHoP demonstrate robust, broad applicability, while others, such as RaFAH and PHIST, excel in specific contexts. We further identify a critical trade-off among predictive accuracy, prediction rate, and computational cost. This work serves as a practical guide for researchers and establishes a standardized benchmark to drive future innovation in deciphering complex virus–host interactions.
Cite
Shang, J., Peng, C., Guan, J., Cai, D., Wang, D., & Sun, Y. (2025, November 1). From genomic signals to prediction tools: a critical feature analysis and rigorous benchmark for phage–host prediction. Briefings in Bioinformatics. Oxford University Press. https://doi.org/10.1093/bib/bbaf626