Analyzing the relationships among a large number of sequences is a significant problem in many applications, such as business processes, sports, voting, and weblogs. Such studies are generally based on clustering the sequences and building a network of relationships, and interpreting and validating the results requires the knowledge of a domain expert. In this paper, we propose a methodology that provides insight into a sequence dataset prior to the analysis and independently of a domain expert. This information can be used to direct the analysis, identify sequences of interest, and expose special patterns in the sequences. The methodology leverages tools such as the transition matrix, Shannon entropy, the complexity index, and pairwise state occurrence. Because these methods have low computational complexity, the approach can be applied to large datasets and helps identify the subsets that should be inspected more closely with more sophisticated tools. The ability of these tools to extract relevant information was validated on two datasets, one from a business process simulation and the other from a robot soccer game simulation.
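As an illustration of two of the tools named above, the following minimal sketch computes a first-order transition matrix and the Shannon entropy of the state distribution for a single sequence. It is not the authors' implementation: the function names, the toy sequence, and the choice of row-normalised transition probabilities and base-2 logarithms are assumptions made only for this example.

```python
from collections import Counter
from math import log2


def transition_matrix(seq, states):
    """Row-normalised first-order transition probabilities (assumed definition)."""
    index = {s: i for i, s in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    # Count every transition between consecutive states.
    for a, b in zip(seq, seq[1:]):
        counts[index[a]][index[b]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the state frequencies in one sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Hypothetical toy sequence over two states.
seq = list("AABBAB")
states = ["A", "B"]
P = transition_matrix(seq, states)
H = shannon_entropy(seq)
```

Both functions make a single pass over the sequence, which is consistent with the low computational cost the abstract attributes to these tools.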
Martinovič, T., Janurová, K., Martinovič, J., Slaninová, K., & Svatoň, V. (2019). Sequence analysis for relationship pattern extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11703 LNCS, pp. 347–358). Springer Verlag. https://doi.org/10.1007/978-3-030-28957-7_29