Predicting the final outcome of an ongoing process instance is a key problem in many real-life contexts. This problem has been addressed mainly by discovering a prediction model by using traditional machine learning methods and, more recently, deep learning methods, exploiting the supervision coming from outcome-class labels associated with historical log traces. However, a supervised learning strategy is unsuitable for important application scenarios where the outcome labels are known only for a small fraction of log traces. In order to address these challenging scenarios, a semi-supervised learning approach is proposed here, which leverages a multi-target DNN model supporting both outcome prediction and the additional auxiliary task of next-activity prediction. The latter task helps the DNN model avoid spurious trace embeddings and overfitting behaviors. In extensive experimentation, this approach is shown to outperform both fully-supervised and semi-supervised discovery methods using similar DNN architectures across different real-life datasets and label-scarce settings.
CITATION STYLE
Folino, F., Folino, G., Guarascio, M., & Pontieri, L. (2022). Semi-Supervised Discovery of DNN-Based Outcome Predictors from Scarcely-Labeled Process Logs. Business and Information Systems Engineering, 64(6), 729–749. https://doi.org/10.1007/s12599-022-00749-9
Mendeley helps you to discover research relevant for your work.