Abstract
Parametric edit-distance-based classification has been applied to two significant problems in bioinformatics: biological sequence analysis (DNA, RNA, protein) and semantic relationship extraction from the biomedical scientific literature. The method is based on the edit distance measure on sequences, with parametric costs for matching, mismatching, inserting, and deleting letters. We present a proof that finding optimal parameter values for such classification from training data is NP-hard, a result that justifies the use of heuristic methods for determining the best parameter values. © 2009 ACM.
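To make the measure described in the abstract concrete, the following is a minimal sketch of an edit distance with four tunable costs, computed by the standard dynamic program. The function name `parametric_edit_distance`, the parameter names, and the default values are illustrative assumptions, not taken from the paper, and the sketch treats the result as a cost to be minimized; the paper's exact formulation and its classification rule may differ.

```python
def parametric_edit_distance(a, b, match=0.0, mismatch=1.0, insert=1.0, delete=1.0):
    """Minimum total cost of turning sequence a into sequence b.

    Hypothetical helper: the four cost parameters are the tunable values,
    and the defaults reproduce the ordinary unit-cost (Levenshtein) distance.
    """
    # prev[j] holds the cost of turning the empty prefix of a into b[:j].
    prev = [j * insert for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty sequence takes i deletions.
        curr = [i * delete]
        for j, cb in enumerate(b, start=1):
            # Align ca with cb (match or mismatch), delete ca, or insert cb.
            align = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(min(align, prev[j] + delete, curr[j - 1] + insert))
        prev = curr
    return prev[-1]
```

With the defaults, `parametric_edit_distance("kitten", "sitting")` gives the usual edit distance between the two strings. Changing the costs changes which alignments are cheapest, and so can change which training sequence is closest to a query; choosing those costs optimally from training data is the problem the abstract states to be NP-hard.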
Cite
Kešelj, V., Liu, H., Zeh, N., Blouin, C., & Whidden, C. (2009). Finding optimal parameters for edit distance based sequence classification is NP-hard. In Proceedings of the KDD-09 Workshop on Statistical and Relational Learning in Bioinformatics, StReBio ’09 (pp. 17–21). https://doi.org/10.1145/1562090.1562094