Finding optimal parameters for edit distance based sequence classification is NP-hard

Abstract

Parametric edit distance based classification has been applied to two significant problems in bioinformatics: biological sequence analysis (DNA, RNA, protein) and semantic relationship extraction from the biomedical scientific literature. The method is based on the edit distance between sequences, with parametric costs for matching, mismatching, inserting, and deleting letters. We present a proof that finding optimal parameter values for such classification from training data is an NP-hard problem, a result that justifies the use of heuristic methods for determining the best parameter values. © 2009 ACM.
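To make the notion of an edit distance with parametric costs concrete, here is a minimal sketch of the standard dynamic-programming recurrence with four cost parameters. The function name `parametric_edit_distance`, the parameter names, and the default values are assumptions chosen for illustration; the abstract does not specify the paper's exact formulation or how the distance is used inside the classifier.

```python
def parametric_edit_distance(a, b, match=0.0, mismatch=1.0,
                             insert=1.0, delete=1.0):
    """Minimum total cost of aligning sequence a to sequence b.

    The four costs are the tunable parameters (names and defaults are
    illustrative assumptions, not taken from the paper).
    """
    n, m = len(a), len(b)
    # prev[j]: cost of turning the empty prefix of a into b[:j] (j insertions).
    prev = [j * insert for j in range(m + 1)]
    for i in range(1, n + 1):
        # curr[0]: cost of turning a[:i] into the empty string (i deletions).
        curr = [i * delete] + [0.0] * m
        for j in range(1, m + 1):
            pair_cost = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = min(
                prev[j - 1] + pair_cost,  # align a[i-1] with b[j-1]
                prev[j] + delete,         # delete a[i-1]
                curr[j - 1] + insert,     # insert b[j-1]
            )
        prev = curr
    return prev[m]
```

With the default costs this reduces to the ordinary Levenshtein distance, so `parametric_edit_distance("kitten", "sitting")` evaluates to 3.0. Changing the costs changes which alignment is cheapest: with `mismatch=3.0`, for instance, replacing one letter by another is cheaper as a deletion plus an insertion (cost 2.0) than as a single mismatch. Choosing the cost values that best separate the classes in the training data is the optimization problem the paper shows to be NP-hard.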

Cite

Kešelj, V., Liu, H., Zeh, N., Blouin, C., & Whidden, C. (2009). Finding optimal parameters for edit distance based sequence classification is NP-hard. In Proceedings of the KDD-09 Workshop on Statistical and Relational Learning in Bioinformatics, StReBio ’09 (pp. 17–21). https://doi.org/10.1145/1562090.1562094
