We present an algorithm for the optimal alignment of sequences to genome graphs. It works by phrasing the edit distance minimization task as finding a shortest path on an implicit alignment graph. To find a shortest path, we instantiate the A* paradigm with a novel domain-specific heuristic function that accounts for the upcoming subsequence in the query to be aligned, resulting in a provably optimal alignment algorithm called AStarix. Experimental evaluation of AStarix shows that it is 1–2 orders of magnitude faster than state-of-the-art optimal algorithms on the task of aligning Illumina reads to reference genome graphs. Implementations and evaluations are available at https://github.com/eth-sri/astarix.
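The shortest-path formulation described above can be illustrated with a minimal Python sketch. It is not the AStarix implementation. The edge-labeled graph encoding, the function names `align_cost` and `missing_letter_heuristic`, and the unit edit costs are all assumptions made for illustration. In particular, the heuristic shown here is a deliberately simple stand-in (it only counts remaining query letters that occur nowhere in the graph), not the paper's heuristic based on the upcoming query subsequence. Each search state is a pair (graph node, number of query letters consumed), and states are generated on demand, so the alignment graph stays implicit.

```python
import heapq

def missing_letter_heuristic(edges, query):
    """Stand-in admissible heuristic (an assumption, NOT the AStarix heuristic).

    Lower-bounds the remaining cost by the number of remaining query letters
    that label no edge of the graph; each such letter costs at least 1.
    """
    graph_letters = {letter for succs in edges.values() for _, letter in succs}
    # suffix[i] = number of letters in query[i:] that are absent from the graph
    suffix = [0] * (len(query) + 1)
    for i in range(len(query) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + (query[i] not in graph_letters)
    return lambda node, i: suffix[i]

def align_cost(edges, query, heuristic=lambda node, i: 0):
    """Minimum edit distance between `query` and any path in the graph.

    `edges` maps a node to a list of (successor, letter) pairs (hypothetical
    encoding). With the default zero heuristic this reduces to Dijkstra.
    """
    nodes = set(edges)
    for succs in edges.values():
        nodes.update(v for v, _ in succs)
    # The alignment may start at any node with no query letters consumed.
    best = {(u, 0): 0 for u in nodes}
    frontier = [(heuristic(u, 0), 0, u, 0) for u in nodes]
    heapq.heapify(frontier)
    while frontier:
        _, cost, u, i = heapq.heappop(frontier)
        if cost > best.get((u, i), float("inf")):
            continue  # stale queue entry
        if i == len(query):
            return cost  # whole query aligned; first goal popped is optimal
        steps = [(u, i + 1, 1)]  # insertion: skip a query letter
        for v, letter in edges.get(u, []):
            steps.append((v, i, 1))  # deletion: skip a graph letter
            steps.append((v, i + 1, 0 if letter == query[i] else 1))  # match or substitution
        for v, j, w in steps:
            new_cost = cost + w
            if new_cost < best.get((v, j), float("inf")):
                best[(v, j)] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(v, j), new_cost, v, j))
    return None  # only reached for a graph without nodes

# Toy graph whose paths spell "ACT" and "AGT".
edges = {0: [(1, "A")], 1: [(2, "C"), (2, "G")], 2: [(3, "T")]}
h = missing_letter_heuristic(edges, "ANT")
print(align_cost(edges, "AGT"), align_cost(edges, "ANT", h))
```

Because the stand-in heuristic never overestimates the remaining cost and is consistent, the search returns the same optimal cost as the zero heuristic. A more informed heuristic would prune more states without affecting optimality.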
Ivanov, P., Bichsel, B., Mustafa, H., Kahles, A., Rätsch, G., & Vechev, M. (2020). AStarix: Fast and optimal sequence-to-graph alignment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12074 LNBI, pp. 104–119). Springer. https://doi.org/10.1007/978-3-030-45257-5_7