Learning k-occurrence regular expressions with interleaving

Abstract

The lack of valid schemas is a critical problem for XML, and existing research on interleaving for XML is insufficient, so this paper focuses on inferring XML schemas with interleaving. Previous research has shown that the essential task in schema learning is inferring regular expressions from a set of given samples. At present, the most powerful model for learning XML schemas is the class of k-occurrence regular expressions (k-OREs for short). However, no existing algorithm can learn k-OREs with interleaving. We therefore propose a complete framework that supports both k-OREs and interleaving. To the best of our knowledge, our work is the first to address these two inference problems at the same time. We first define a new subclass of regular expressions, named k-OIREs, and develop an inference algorithm, iKOIRE, that learns k-OIREs using a genetic algorithm and maximum independent set (MIS). We then conduct a series of experiments on large-scale real datasets and evaluate the effectiveness of our work against both current learning algorithms from academia and industrial tools used in practice. The results show the high practicability and strong performance of our work and indicate promising prospects for its application.
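
The abstract does not define its two central notions, so the following Python sketch illustrates them under the standard definitions: interleaving (shuffle) merges two words while preserving the internal order of each, and an expression is k-occurrence if no alphabet symbol appears in it more than k times. The function names `is_interleaving` and `max_symbol_occurrence` are hypothetical and chosen for illustration only; this is not the iKOIRE algorithm, and the paper's exact definition of k-OIREs may impose further restrictions.

```python
from collections import Counter
from functools import lru_cache


def is_interleaving(word, left, right):
    """Return True if `word` is a shuffle of `left` and `right`,
    i.e. it merges them while keeping each one's internal order."""
    if len(word) != len(left) + len(right):
        return False

    @lru_cache(maxsize=None)
    def match(i, j):
        # i symbols of `left` and j symbols of `right` consumed so far
        if i == len(left) and j == len(right):
            return True
        k = i + j
        if i < len(left) and left[i] == word[k] and match(i + 1, j):
            return True
        return j < len(right) and right[j] == word[k] and match(i, j + 1)

    return match(0, 0)


def max_symbol_occurrence(expr_symbols):
    """Smallest k for which an expression, given as the list of alphabet
    symbols it contains (operators stripped), is k-occurrence."""
    return max(Counter(expr_symbols).values(), default=0)
```

For instance, with single-character symbols standing in for element names, `is_interleaving("acbd", "ab", "cd")` holds because `a` still precedes `b` and `c` still precedes `d`, whereas `"bacd"` is rejected. An expression containing the symbols `a, b, a, c` is 2-occurrence because `a` appears twice.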

Cite

Li, Y., Zhang, X., Cao, J., Chen, H., & Gao, C. (2019). Learning k-occurrence regular expressions with interleaving. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11447 LNCS, pp. 70–85). Springer Verlag. https://doi.org/10.1007/978-3-030-18579-4_5
