Multiple sequence alignment (MSA) is a central problem in computational biology. It is well known that MSA can be formulated as a shortest path problem and solved using heuristic search, but the memory requirement of A* makes it impractical for all but the smallest problems. Partial Expansion A* (PEA*) reduces the memory requirement of A* by generating only the most promising successor nodes. However, even PEA* exhausts available memory on many problems. Another alternative is Iterative Deepening Dynamic Programming, which uses an uninformed search order but stores only the nodes along the search frontier. However, it too cannot scale to the largest problems. In this paper, we propose storing nodes on cheap and plentiful secondary storage. We present a new general-purpose algorithm, Parallel External PEA* (PE2A*), that combines PEA* with Delayed Duplicate Detection to take advantage of external memory and multiple processors to solve large MSA problems. In our experiments, PE2A* is the first algorithm capable of solving the entire Reference Set 1 of the standard BAliBASE benchmark using a biologically accurate cost function. This work suggests that external best-first search can effectively use heuristic information to surpass methods that rely on uninformed search orders. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
CITATION STYLE
Hatem, M., & Ruml, W. (2013). External memory best-first search for multiple sequence alignment. In Proceedings of the 27th AAAI Conference on Artificial Intelligence, AAAI 2013 (pp. 409–416). https://doi.org/10.1609/aaai.v27i1.8626
Mendeley helps you to discover research relevant for your work.