Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization

Fabrizio Carcillo; Yann Aël Le Borgne; Olivier Caelen; Gianluca Bontempi

Journal Article

International Journal of Data Science and Analytics (2018) 5(4) 285-300

DOI: 10.1007/s41060-018-0116-z

65 Citations

101 Readers

Abstract

Credit card fraud detection is a very challenging problem because of the specific nature of transaction data and the labeling process. The transaction data are peculiar because they are obtained in a streaming fashion, and they are strongly imbalanced and prone to non-stationarity. The labeling is the outcome of an active learning process, as every day human investigators contact only a small number of cardholders (associated with the riskiest transactions) and obtain the class (fraud or genuine) of the related transactions. An adequate selection of the set of cardholders is therefore crucial for an efficient fraud detection process. In this paper, we present a number of active learning strategies and we investigate their fraud detection accuracies. We compare different criteria (supervised, semi-supervised and unsupervised) to query unlabeled transactions. Finally, we highlight the existence of an exploitation/exploration trade-off for active learning in the context of fraud detection, which has so far been overlooked in the literature.
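The exploitation/exploration trade-off mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_queries`, the `explore_fraction` parameter, and the use of uniform random sampling for exploration are all assumptions made for illustration, and the sketch queries transactions directly rather than cardholders. It assumes a classifier has already assigned a risk score to each unlabeled transaction of the day and that investigators can check only a fixed daily budget.

```python
import random


def select_queries(risk_scores, budget, explore_fraction=0.1, seed=0):
    """Choose which unlabeled transactions to send to investigators.

    risk_scores: one float per transaction (higher means riskier).
    budget: number of transactions that can be investigated today.
    explore_fraction: hypothetical knob, share of the budget spent on
        randomly chosen transactions instead of the riskiest ones.
    Returns two lists of indices: (exploit, explore).
    """
    rng = random.Random(seed)
    budget = min(budget, len(risk_scores))
    n_explore = int(budget * explore_fraction)
    n_exploit = budget - n_explore

    # Exploitation: the riskiest transactions according to the current model.
    ranked = sorted(range(len(risk_scores)),
                    key=lambda i: risk_scores[i], reverse=True)
    exploit = ranked[:n_exploit]

    # Exploration: a uniform random sample of the remaining transactions,
    # so the model also receives labels outside its own high-risk region.
    explore = rng.sample(ranked[n_exploit:], n_explore)
    return exploit, explore


scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.05, 0.6, 0.4, 0.5]
exploit, explore = select_queries(scores, budget=5, explore_fraction=0.4)
print("exploit:", exploit, "explore:", explore)
```

Setting `explore_fraction` to 0 spends the whole budget on the highest-risk transactions, while larger values trade immediate detections for labels on transactions the current model considers low risk.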

Cite (APA)

Carcillo, F., Le Borgne, Y. A., Caelen, O., & Bontempi, G. (2018). Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization. International Journal of Data Science and Analytics, 5(4), 285–300. https://doi.org/10.1007/s41060-018-0116-z
