Abstract
Communication by electronic mail (e-mail), once a luxury, is now the usual way to exchange data and information. Widely accepted by Internet users, businesses and governments, it is regarded as a key part of the e-revolution. E-mail systems have been successfully implemented in almost all computer-aided domains of human interest, providing efficient, effective and permanent mechanisms of transmission. However, to date there is still no capability to present an ordered list (sequence) of e-mail message senders and recipients together with the time elapsed between receiving a message and answering it. To fill this gap, in this paper we introduce the SOMF algorithm for mining such sequences from server log data. We specify a three-stage approach to address the problem comprehensively. The first stage is data preparation, which assembles the input for the algorithm. The second, data mining, is the automatic analysis of the input data, performed in an unsupervised manner by the SOMF algorithm. The third covers visualization, interpretation and evaluation of the output (knowledge). The case study is based on log data from an operational SMTP server. This deliberately simplified example aids understanding of the solution and indicates one of its potential applications: identifying and eliminating deadlocks in the realization of business processes. We also tested the efficiency of the implementation of the algorithm in five independent experiments on seven datasets of varying size. The results show that mining even 1 million rows takes less than about 6 minutes.
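To make the three stages concrete, the following is a minimal sketch only, not the SOMF algorithm itself, whose details the abstract does not give. It assumes the log has already been reduced to hypothetical (timestamp, sender, recipient) rows, and it approximates the time between receiving and answering as the delay until the recipient next sends any message. The names `LOG`, `prepare`, `mine_sequence` and `render` are illustrative assumptions.

```python
from datetime import datetime

# Hypothetical, already-extracted log rows: (timestamp, sender, recipient).
LOG = [
    ("2018-01-08 09:30:00", "bob@example.com", "carol@example.com"),
    ("2018-01-08 09:00:00", "alice@example.com", "bob@example.com"),
    ("2018-01-08 11:00:00", "carol@example.com", "alice@example.com"),
]


def prepare(rows):
    """Stage 1 (data preparation): parse timestamps and sort chronologically."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return sorted((datetime.strptime(ts, fmt), s, r) for ts, s, r in rows)


def mine_sequence(rows):
    """Stage 2 (mining, simplified): order the messages and, for each one,
    measure how long the recipient took to send their next message.

    Returns (sender, recipient, delay_in_minutes); delay is None when the
    recipient sends nothing later in the log.
    """
    events = prepare(rows)
    steps = []
    for i, (received_at, sender, recipient) in enumerate(events):
        delay = None
        for sent_at, next_sender, _ in events[i + 1:]:
            if next_sender == recipient:
                delay = (sent_at - received_at).total_seconds() / 60
                break
        steps.append((sender, recipient, delay))
    return steps


def render(steps):
    """Stage 3 (visualization, simplified): one text line per step."""
    lines = []
    for sender, recipient, delay in steps:
        waited = "no answer" if delay is None else f"answered after {delay:.0f} min"
        lines.append(f"{sender} -> {recipient} ({waited})")
    return lines


if __name__ == "__main__":
    print("\n".join(render(mine_sequence(LOG))))
```

In a sketch like this, a long delay at one step is the kind of signal that could point to a stalled step in a business process; a real implementation would also need to match replies to the specific messages they answer.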
Cite
Weichbroth, P. (2018). Mining e-mail message sequences from log data. In Proceedings of the 2018 Federated Conference on Computer Science and Information Systems, FedCSIS 2018 (pp. 855–858). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.15439/2018F325