Online suffix tree construction for streaming sequences

Abstract

In this study, we present an online suffix tree construction approach in which multiple sequences are indexed by a single suffix tree. Because of poor memory locality and high space consumption, online suffix tree construction on disk is a challenging process. Moreover, construction performance suffers when the alphabet is large. To overcome these difficulties, we first present a space-efficient node representation for use in Ukkonen's suffix tree construction algorithm. Next, we show that performance can be improved by incorporating semantic knowledge, such as the frequently used letters of an alphabet. In particular, we estimate which nodes of the tree are accessed frequently and introduce a strategy for inserting sequences into the tree, which speeds up access to those nodes. Finally, we analyze the effect of buffering strategies and page sizes on performance through a series of experiments run under various buffering strategies and page sizes. Experimental results show that our approach outperforms existing ones. © 2008 Springer-Verlag.
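To make the idea of indexing multiple sequences in a single tree concrete, the following is a minimal sketch only. It is not the authors' method: it uses a naive, uncompressed suffix trie with quadratic insertion cost instead of Ukkonen's algorithm, it keeps everything in memory instead of on disk, and it does not use the paper's node representation. The names `Node`, `NaiveGeneralizedSuffixTrie`, `add_sequence`, and `find` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Maps one symbol to the child node reached by that symbol.
    children: dict = field(default_factory=dict)
    # Ids of the sequences containing the substring spelled by the path to this node.
    seq_ids: set = field(default_factory=set)


class NaiveGeneralizedSuffixTrie:
    """Illustrative stand-in for a generalized suffix tree (assumption:
    naive O(n^2) insertion per sequence, not Ukkonen's linear-time algorithm)."""

    def __init__(self):
        self.root = Node()
        self.num_sequences = 0

    def add_sequence(self, seq):
        """Insert every suffix of seq; sequences may arrive one at a time."""
        seq_id = self.num_sequences
        self.num_sequences += 1
        for start in range(len(seq)):
            node = self.root
            for symbol in seq[start:]:
                node = node.children.setdefault(symbol, Node())
                node.seq_ids.add(seq_id)
        return seq_id

    def find(self, pattern):
        """Return the ids of all indexed sequences that contain pattern."""
        if not pattern:
            # The empty pattern occurs in every sequence.
            return set(range(self.num_sequences))
        node = self.root
        for symbol in pattern:
            node = node.children.get(symbol)
            if node is None:
                return set()
        return set(node.seq_ids)


trie = NaiveGeneralizedSuffixTrie()
trie.add_sequence("banana")   # receives id 0
trie.add_sequence("bandana")  # receives id 1
print(trie.find("ana"))
print(trie.find("nan"))
```

Because all sequences share one tree, a single traversal answers which sequences contain a pattern. A real suffix tree would additionally compress single-child paths into labeled edges, which is where a compact node representation becomes important.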

Cite

Ozcan, G., & Alpkocak, A. (2008). Online suffix tree construction for streaming sequences. In Communications in Computer and Information Science (Vol. 6 CCIS, pp. 69–81). https://doi.org/10.1007/978-3-540-89985-3_9
