Abstract
Motivation: Alignment of three-dimensional protein tertiary structures is a fundamental problem whose solutions offer insight into protein function and evolution. Previous structure alignment algorithms have adopted the sequential assumption and used dynamic programming solvers. However, many distantly related structures exhibit non-sequential similarities, and existing non-sequential alignment tools are less efficient and less accurate than sequential ones. In this paper, we formulate non-sequential alignment as the Entropy-regularized Partial Linear Sum Assignment Problem (epLSAP) and propose a solver based on Sinkhorn algorithms, referred to as epLSAP-Align.

Results: Compared with existing non-sequential alignment solvers, epLSAP-Align explicitly models the gap penalty, efficiently achieves global optimality, and balances coverage and fidelity. We show that epLSAP-Align can be easily integrated into existing frameworks such as TM-align and MICAN, resulting in the non-sequential alignment tools epLSAP-TM and epLSAP-MICAN, respectively. Both epLSAP-TM and epLSAP-MICAN outperform existing non-sequential alignment tools in terms of biologically meaningful structure overlaps on two sequential alignment test sets, MALIDUP and MALISAM, and four non-sequential alignment test sets, MALIDUP-ns, MALISAM-ns, 64-difficult-case, and RIPC. In addition, compared with the most recent non-sequential alignment tool, USalign2, epLSAP-TM is at least 22% faster under the same setting.
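To make the idea of an entropy-regularized partial assignment with a gap penalty more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes one common construction: the residue-pair cost matrix is augmented with a dummy row and a dummy column that absorb unaligned residues at a fixed cost, and the resulting balanced problem is solved with plain Sinkhorn iterations. The function name `sinkhorn_partial_assignment`, the parameters `gap_cost`, `epsilon` and `n_iters`, and the toy cost matrix are all hypothetical.

```python
import math


def sinkhorn_partial_assignment(cost, gap_cost=1.0, epsilon=0.1, n_iters=500):
    """Soft partial assignment between n and m residues (illustrative only).

    cost[i][j] is a dissimilarity between residue i of structure A and
    residue j of structure B. gap_cost is the assumed cost of leaving one
    residue unaligned. Returns an n x m matrix of soft match weights.
    """
    n, m = len(cost), len(cost[0])

    # Augment with a dummy row and column meaning "leave unaligned".
    aug = [list(row) + [gap_cost] for row in cost]
    aug.append([gap_cost] * m + [0.0])

    # Each real residue carries mass 1. The dummy row can absorb every
    # column and the dummy column every row, so the totals balance.
    row_mass = [1.0] * n + [float(m)]
    col_mass = [1.0] * m + [float(n)]

    # Gibbs kernel of the entropy-regularized problem.
    kernel = [[math.exp(-c / epsilon) for c in row] for row in aug]

    u = [1.0] * (n + 1)
    v = [1.0] * (m + 1)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m + 1))
             for i in range(n + 1)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n + 1))
             for j in range(m + 1)]

    # Drop the dummy row and column to obtain the residue-residue plan.
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: residue 0 of A resembles residue 2 of B, residue 1 of A
# resembles residue 0 of B, and residue 2 of A resembles nothing.
toy_cost = [
    [5.0, 5.0, 0.1],
    [0.2, 5.0, 5.0],
    [5.0, 5.0, 5.0],
]
plan = sinkhorn_partial_assignment(toy_cost)
aligned = [(i, j) for i, row in enumerate(plan)
           for j, w in enumerate(row) if w > 0.5]
```

In this toy case the recovered pairs cross each other, which a sequential dynamic programming alignment could not represent, and the residue with no good partner is left unaligned because paying two gap costs is cheaper than a poor match. The threshold of 0.5 used to read off pairs is likewise only an illustrative choice.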
Zhang, X., Chen, Z., Li, J., Luo, Q., Wu, L., & Yu, W. (2025). epLSAP-Align: a non-sequential protein structural alignment solver with entropy-regularized partial linear sum assignment problem formulation. Bioinformatics, 41(6). https://doi.org/10.1093/bioinformatics/btaf309