Tree string path subsequences automaton and its use for indexing XML documents

Abstract

The theory of indexing texts is well researched, but the same does not hold for indexing other data structures, such as trees. This paper presents a simple method of indexing a tree for subsequences of its string paths by means of a finite automaton. The use of the index is demonstrated on indexing XML documents for queries inspired by the XPath descendant-or-self axis. Given a subject tree T with n nodes, the tree is preprocessed and an index is constructed; the index is a directed acyclic subsequence graph for a set of strings. The searching phase uses the index to read an input string path subsequence Q of size m, inspired by the specific XPath query, and computes the list of positions of all occurrences of Q in the tree T. Searching takes O(m) time and does not depend on n. Although the number of distinct valid queries is O(2^n), the size of the index is O(h^k), where h is the height of the tree T and k is the number of its leaves. Moreover, we argue that when indexing a common XML document the size of the index is even smaller, O(h · 2^k).
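
To make the idea of the abstract concrete, the following is a minimal Python sketch of a deterministic index that answers string path subsequence queries in time proportional to the query length. It is not the authors' construction of the directed acyclic subsequence graph and makes no attempt to match the stated size bounds; it uses a plain subset construction over sets of tree nodes. The names `Node`, `build_index`, and `search`, the use of preorder numbers as positions, and the virtual node `-1` above the root are all assumptions made for illustration.

```python
from collections import deque


class Node:
    """A labeled tree node (hypothetical helper, e.g. one XML element)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def build_index(root):
    """Preprocess the tree into a deterministic automaton.

    A state is a frozenset of preorder node ids at which the query read so
    far can end. Returns (start_state, transitions).
    """
    # Number the nodes in preorder; these numbers serve as positions.
    counter = 0

    def number(node):
        nonlocal counter
        node.id = counter
        counter += 1
        for child in node.children:
            number(child)

    number(root)

    # desc[i][label] = ids of proper descendants of node i with that label.
    desc = {}

    def collect(node):
        by_label = {}
        for child in node.children:
            by_label.setdefault(child.label, set()).add(child.id)
            for label, ids in collect(child).items():
                by_label.setdefault(label, set()).update(ids)
        desc[node.id] = by_label
        return by_label

    collect(root)

    # Virtual node -1 sits above the root, so every real node is below it.
    top = {label: set(ids) for label, ids in desc[root.id].items()}
    top.setdefault(root.label, set()).add(root.id)
    desc[-1] = top

    # Subset construction: explore every state reachable from the start.
    start = frozenset({-1})
    transitions = {}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in transitions:
            continue
        merged = {}
        for node_id in state:
            for label, ids in desc[node_id].items():
                merged.setdefault(label, set()).update(ids)
        transitions[state] = {
            label: frozenset(ids) for label, ids in merged.items()
        }
        queue.extend(transitions[state].values())
    return start, transitions


def search(index, query):
    """Return sorted positions where the label sequence `query` ends.

    Performs one dictionary lookup per query symbol, independent of n.
    """
    state, transitions = index
    if not query:
        return []
    for label in query:
        state = transitions[state].get(label)
        if state is None:
            return []
    return sorted(state)


# Example tree: a(0) has children b(1) and c(3); b(1) has child c(2);
# c(3) has child b(4). The numbers are preorder positions.
root = Node("a", [Node("b", [Node("c")]), Node("c", [Node("b")])])
index = build_index(root)
print(search(index, ["b", "c"]))  # analogous to the XPath query //b//c
```

In this sketch a query such as `["b", "c"]` plays the role of the XPath expression `//b//c`. Once the index is built, each query symbol costs a single transition lookup, which mirrors the O(m) searching phase described above; the final sort is over the reported positions only.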

Cite

Šestáková, E., & Janoušek, J. (2015). Tree string path subsequences automaton and its use for indexing XML documents. In Communications in Computer and Information Science (Vol. 563, pp. 171–181). Springer Verlag. https://doi.org/10.1007/978-3-319-27653-3_17
