Determining protein sequence similarity is an important task for protein classification and homology detection. This is typically done with sequence alignment algorithms, but fast and accurate alignment-free, kernel-based classifiers also exist. Viewing sequences as a "bag of words", we test a simple weighted string kernel and investigate the effects of k-mer length, sequence length, and choice of weighting. We also extend the kernel to operate on the k-mer frequency representation of a sequence rather than the "bag of words" representation. © Springer-Verlag Berlin Heidelberg 2005.
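The abstract does not give the kernel's definition, so the following is only a minimal sketch of the general idea, assuming a spectrum-style kernel: each sequence is reduced to its overlapping k-mers, and the kernel is a weighted inner product over the k-mers two sequences share. The function names (`kmer_counts`, `kmer_frequencies`, `weighted_kernel`, `normalized_kernel`) and the `weight` callback are hypothetical and are not taken from the paper; the paper's actual weighting scheme may differ.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq (the "bag of words" view)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_frequencies(seq, k):
    """Relative k-mer frequencies, so sequence length has less influence."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


def weighted_kernel(x, y, weight=None):
    """Weighted inner product over the k-mers present in both x and y.

    x and y map k-mers to counts or frequencies. weight is an assumed
    per-k-mer weighting function; it defaults to a uniform weight of 1.
    """
    if weight is None:
        weight = lambda kmer: 1.0
    return sum(weight(kmer) * x[kmer] * y[kmer] for kmer in x.keys() & y.keys())


def normalized_kernel(x, y, weight=None):
    """Cosine-style normalization, giving 1.0 for identical representations."""
    denom = sqrt(weighted_kernel(x, x, weight) * weighted_kernel(y, y, weight))
    return weighted_kernel(x, y, weight) / denom if denom else 0.0
```

Passing the output of `kmer_counts` gives the count-based ("bag of words") variant, while passing the output of `kmer_frequencies` gives a frequency-based variant in the spirit of the extension described above. The resulting kernel values could then be supplied to any classifier that accepts a precomputed kernel matrix.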
Spalding, J. D., & Hoyle, D. C. (2005). Accuracy of string kernels for protein sequence classification. In Lecture Notes in Computer Science (Vol. 3686, pp. 454–460). Springer Verlag. https://doi.org/10.1007/11551188_49