Speeding up HMM decoding and training by exploiting sequence repetitions

Shay Mozes; Oren Weimann; Michal Ziv-Ukelson

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4580 LNCS 4-15

DOI: 10.1007/978-3-540-73437-6_4

Abstract

We present a method to speed up the dynamic programming algorithms used for solving the HMM decoding and training problems for discrete time-independent HMMs. We discuss the application of our method to Viterbi's decoding and training algorithms [21], as well as to the forward-backward and Baum-Welch [4] algorithms. Our approach is based on identifying repeated substrings in the observed input sequence. We describe three algorithms based, respectively, on byte pair encoding (BPE) [19], run length encoding (RLE) and Lempel-Ziv (LZ78) parsing [22]. Compared to Viterbi's algorithm, we achieve a speedup of Ω(r) using BPE, a speedup of Ω(r/log r) using RLE, and a speedup of Ω((log n)/k) using LZ78, where k is the number of hidden states, n is the length of the observed sequence and r is its compression ratio (under each compression scheme). Our experimental results demonstrate that our new algorithms are indeed faster in practice. Furthermore, unlike Viterbi's algorithm, our algorithms are highly parallelizable. © Springer-Verlag Berlin Heidelberg 2007.
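The idea of reusing work across repeated substrings can be illustrated with a minimal Python sketch. It is not one of the three algorithms from the paper: instead of BPE, RLE or LZ78 it uses a naive fixed-length block parse as a stand-in, and it computes only the probability of the best state path, with no traceback. The function names, the `block_len` parameter and the toy two-state model are all assumptions made for illustration. Each observed symbol s is given a k x k matrix, a Viterbi step becomes a (max, *) matrix-vector product, and the matrix of a repeated block is computed once and then reused.

```python
def max_times_matvec(M, v):
    """(max, *) product of a k x k matrix with a length-k vector."""
    return [max(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def max_times_matmul(A, B):
    """(max, *) product of two k x k matrices: C[i][j] = max_m A[i][m] * B[m][j]."""
    k = len(A)
    return [[max(A[i][m] * B[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]

def symbol_matrix(trans, emit, s):
    """M_s[i][j] = P(state j -> state i) * P(state i emits symbol s)."""
    k = len(trans)
    return [[trans[j][i] * emit[i][s] for j in range(k)] for i in range(k)]

def viterbi_plain(init, trans, emit, obs):
    """Probability of the most likely state path, one symbol at a time."""
    v = [init[i] * emit[i][obs[0]] for i in range(len(init))]
    for s in obs[1:]:
        v = max_times_matvec(symbol_matrix(trans, emit, s), v)
    return max(v)

def viterbi_blocks(init, trans, emit, obs, block_len=4):
    """Same value, but every distinct block of symbols gets one cached matrix.

    Returns the best-path probability and the number of distinct blocks.
    The fixed-length parse is a simplifying assumption, not the paper's parse.
    """
    v = [init[i] * emit[i][obs[0]] for i in range(len(init))]
    cache = {}
    rest = obs[1:]
    for start in range(0, len(rest), block_len):
        block = tuple(rest[start:start + block_len])
        if block not in cache:
            # Later symbols multiply from the left: M = M_last x ... x M_first.
            M = symbol_matrix(trans, emit, block[0])
            for s in block[1:]:
                M = max_times_matmul(symbol_matrix(trans, emit, s), M)
            cache[block] = M
        # One matrix-vector product per block occurrence.
        v = max_times_matvec(cache[block], v)
    return max(v), len(cache)

# Hypothetical two-state model over the alphabet {0, 1}.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]   # trans[j][i] = P(state j -> state i)
emit = [[0.9, 0.1], [0.2, 0.8]]    # emit[i][s] = P(state i emits s)
obs = [0] + [1, 0, 0, 1] * 6       # highly repetitive observation sequence

p_plain = viterbi_plain(init, trans, emit, obs)
p_block, distinct_blocks = viterbi_blocks(init, trans, emit, obs)
```

In this toy run the six blocks after the first symbol are identical, so only one block matrix is built and the remaining occurrences cost a single matrix-vector product each. The two functions agree up to floating-point rounding because the (max, *) product is associative. Building a block matrix costs more than a single Viterbi step, so a scheme like this only pays off when blocks repeat often enough.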

Cite

Mozes, S., Weimann, O., & Ziv-Ukelson, M. (2007). Speeding up HMM decoding and training by exploiting sequence repetitions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4580 LNCS, pp. 4–15). Springer Verlag. https://doi.org/10.1007/978-3-540-73437-6_4
