Learning k-occurrence regular expressions with interleaving

Yeting Li; Xiaolan Zhang; Jialun Cao; Haiming Chen; Chong Gao

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11447 LNCS 70-85

DOI: 10.1007/978-3-030-18579-4_5

Abstract

The lack of valid schemas is a critical problem for XML, and existing research on interleaving for XML is insufficient, so this paper focuses on the inference of XML schemas with interleaving. Previous research has shown that the essential task in schema learning is inferring regular expressions from a set of given samples. At present, the most powerful model for learning XML schemas is the class of k-occurrence regular expressions (k-OREs for short). However, no existing algorithm can learn k-OREs with interleaving. We therefore propose a complete framework that supports both k-OREs and interleaving. To the best of our knowledge, our work is the first to address these two inference problems at the same time. We first define a new subclass of regular expressions named k-OIREs, and develop an inference algorithm, iKOIRE, that learns k-OIREs based on a genetic algorithm and maximum independent set (MIS). We then conduct a series of experiments on large-scale real datasets and evaluate the effectiveness of our work against both existing learning algorithms from academia and industrial tools. The results show the high practicability and strong performance of our work and indicate promising prospects for application.
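
The two notions the abstract combines can be illustrated with a small sketch. It is not the iKOIRE algorithm, only an illustration of the terminology, under these assumptions: symbols are single characters, interleaving is the usual shuffle of two words, and a k-ORE is read as an expression in which each alphabet symbol occurs at most k times. The function names `interleave` and `occurrence_bound` are hypothetical and do not come from the paper.

```python
from collections import Counter


def interleave(u, v):
    """Return the set of all interleavings (shuffles) of strings u and v.

    Each result keeps the relative order of the symbols of u and of v.
    """
    if not u:
        return {v}
    if not v:
        return {u}
    # Either the next symbol comes from u, or it comes from v.
    return ({u[0] + w for w in interleave(u[1:], v)}
            | {v[0] + w for w in interleave(u, v[1:])})


def occurrence_bound(expr, alphabet):
    """Return the smallest k such that every symbol in `alphabet`
    occurs at most k times in the expression string `expr`.

    Assumption: symbols are single characters; operators are ignored.
    """
    counts = Counter(ch for ch in expr if ch in alphabet)
    return max(counts.values(), default=0)
```

For instance, `interleave("a", "bc")` yields the three words `abc`, `bac`, and `bca`, and under the assumptions above the expression string `(a|b)*&(ac?)` has occurrence bound 2 because `a` appears twice.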

Citation (APA)

Li, Y., Zhang, X., Cao, J., Chen, H., & Gao, C. (2019). Learning k-occurrence regular expressions with interleaving. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11447 LNCS, pp. 70–85). Springer Verlag. https://doi.org/10.1007/978-3-030-18579-4_5