Background: Because tumors often show a high level of intra-tumor heterogeneity, multiple samples from the same patient, taken at different locations or time points, are often sequenced to study this heterogeneity or tumor evolution. In virus-related tumors, such as those associated with human papillomavirus and hepatitis B virus, integration of the viral genome can be a critical driver event. It is therefore important to identify the integration sites of viral genomes. Several algorithms have been developed for detecting virus integration sites from high-throughput sequencing data, but their limited sensitivity and specificity and their high computational cost hinder their application to sequencing data from multiple related tumor samples. Results: We developed VirTect to detect virus integration sites simultaneously from data on multiple related samples. The algorithm is based mainly on the joint analysis of short reads that span integration-site breakpoints across multiple samples. To achieve high specificity and breakpoint accuracy, it uses a local precise sandwich alignment algorithm. Analyses of simulated and real data show that, compared with other algorithms, VirTect is significantly more sensitive and has a similar or lower false discovery rate. Conclusions: VirTect provides more accurate breakpoint positions and is computationally much more efficient in terms of both memory usage and run time.
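The idea of jointly analysing breakpoint-spanning reads from related samples can be illustrated with a minimal sketch. This is not VirTect's implementation: the function name `pool_breakpoint_support`, the input layout, and the `window` and `min_total_reads` thresholds are all hypothetical, and the sandwich alignment step is not modelled. The sketch only shows why pooling can raise sensitivity: a site supported by too few reads in each sample alone may pass a threshold once the samples are combined.

```python
from collections import defaultdict


def pool_breakpoint_support(samples, window=10, min_total_reads=3):
    """Cluster chimeric reads from several related samples by human position.

    samples: dict mapping sample name -> list of (chrom, human_pos, virus_pos),
             one tuple per read that spans a candidate human-virus junction.
    window and min_total_reads are illustrative values, not VirTect defaults.
    """
    # Pool reads from every sample, keeping track of the sample of origin.
    tagged = sorted(
        (chrom, pos, name)
        for name, reads in samples.items()
        for chrom, pos, _virus_pos in reads
    )

    clusters = []
    for chrom, pos, name in tagged:
        last = clusters[-1] if clusters else None
        # Extend the current cluster if the read lies close to its last position.
        if last and last["chrom"] == chrom and pos - last["end"] <= window:
            last["end"] = pos
            last["support"][name] += 1
        else:
            support = defaultdict(int)
            support[name] = 1
            clusters.append(
                {"chrom": chrom, "start": pos, "end": pos, "support": support}
            )

    # Keep clusters whose read support, summed over all samples, is sufficient.
    return [c for c in clusters if sum(c["support"].values()) >= min_total_reads]


# Hypothetical input: no single sample has three reads at the chr8 site.
example = {
    "T1": [("chr8", 1000, 50), ("chr8", 1003, 52)],
    "T2": [("chr8", 1001, 50)],
    "T3": [("chr2", 500, 10)],
}
candidates = pool_breakpoint_support(example)
```

In this toy input the chr8 site is retained only because reads from T1 and T2 are counted together, while the isolated chr2 read is filtered out.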
Citation
Xia, Y., Liu, Y., Deng, M., & Xi, R. (2019). Detecting virus integration sites based on multiple related sequencing data by VirTect. BMC Medical Genomics, 12. https://doi.org/10.1186/s12920-018-0461-8