
Using classification methods to label tasks in process mining

by Scott Buffett, Liqiang Geng
Journal of Software Maintenance and Evolution: Research and Practice (2010)

Abstract

We investigate a method designed to improve the accuracy of process mining in scenarios where the identification of task labels for log events is uncertain. Such situations are prevalent in business processes where events consist of communications between people, such as email messages. We examine how the accuracy of an independent task identifier, such as a classification or clustering engine, can be improved by examining the currently mined process model. First, a classification scheme based on identifying the keywords in each message is presented to provide an initial labeling. We then demonstrate how these labels can be refined by considering the likelihood that the event represents a particular task as obtained via an analysis of the current representation of the process model. This process is then repeated a number of times until the model is sufficiently refined. Results show that both keyword classification and the current process model analysis can be significantly effective on their own, and when combined have the potential to correct virtually all errors when noise is low (less than 20%), and can reduce the error rate by about 85% when noise is in the 30-40% range. Copyright 2010 Crown in the right of Canada.
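The loop described in the abstract (initial keyword labeling, then repeated relabeling against the current process model) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the classifier details or the model representation, so the sketch assumes a naive Bayes keyword score and substitutes a smoothed first-order transition model for the mined process model. The names `keyword_scores`, `transition_model`, `refine_labels`, and the `keyword_probs` table are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def keyword_scores(words, keyword_probs, floor=0.01):
    """Naive Bayes log-likelihood of a message's words under each task.

    keyword_probs maps task -> {word: P(word | task)}; it is assumed to be
    given. Words missing from a task's table get the small `floor` value.
    """
    return {
        task: sum(log(probs.get(w, floor)) for w in words)
        for task, probs in keyword_probs.items()
    }


def transition_model(trace_labels, tasks, alpha=1.0):
    """Smoothed P(next | prev) estimated from the current labeling.

    Assumption: a first-order transition table stands in for the mined
    process model. '^' and '$' mark the start and end of a trace.
    """
    counts = defaultdict(Counter)
    for labels in trace_labels:
        path = ["^"] + labels + ["$"]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    n_targets = len(tasks) + 1  # every task plus the end marker

    def prob(prev, nxt):
        total = sum(counts[prev].values())
        return (counts[prev][nxt] + alpha) / (total + alpha * n_targets)

    return prob


def refine_labels(traces, keyword_probs, max_rounds=10):
    """Label every event, then relabel until the labels stop changing."""
    tasks = list(keyword_probs)
    scores = [[keyword_scores(w, keyword_probs) for w in trace] for trace in traces]
    # Initial labeling uses keyword evidence only.
    labels = [[max(tasks, key=s.get) for s in trace] for trace in scores]
    for _ in range(max_rounds):
        prob = transition_model(labels, tasks)
        new_labels = []
        for trace_scores, trace_labels in zip(scores, labels):
            path = ["^"] + trace_labels + ["$"]
            relabeled = []
            for i, s in enumerate(trace_scores):
                prev, nxt = path[i], path[i + 2]  # neighbours from the last round
                relabeled.append(max(
                    tasks,
                    key=lambda t: s[t] + log(prob(prev, t)) + log(prob(t, nxt)),
                ))
            new_labels.append(relabeled)
        if new_labels == labels:  # converged: the model no longer changes any label
            break
        labels = new_labels
    return labels
```

Each trace is a list of word sets, one per message. When a message contains no known keyword, the keyword score alone cannot separate the tasks, and the relabeling step falls back on what the neighbouring events and the other traces make most likely.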


NRC Publications Archive (NPArC)
Publisher's version: Journal of Software Maintenance and Evolution: Research and Practice, 22, 6-7, pp. 497-517, 2010-09-01
Web page:
http://dx.doi.org/10.1002/smr.463
http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/ctrl?action=rtdoc&an=15188891&lang=en
Contact us: nparc.cisti@nrc-cnrc.gc.ca
Access and use of this website and the material on it are subject to the Terms and Conditions set forth at http://nparc.cisti-icist.nrc-cnrc.gc.ca/npsi/jsp/nparc_cp.jsp?lang=en
Using Classification Methods to Label Tasks in Process Mining
Scott Buffett and Liqiang Geng
Institute for Information Technology - e-Business, National Research Council,
Fredericton, New Brunswick, Canada, E3B 9W4
{scott.buffett, liqiang.geng}@nrc.gc.ca
Keywords: workflow, process mining, task labeling, Bayesian classification
1 Introduction
In recent years, research in business process management has devoted considerable effort to process
mining. Process mining involves automatically (or semi-automatically) inspecting a log of machine-level

