Automated Machine Learning with Genetic Programming on Real Dataset of Tax Avoidance Classification Problem

Abstract

Real application datasets are often a stumbling block for machine learning algorithms, which struggle to produce good results on them in both prediction and classification problems. The major reason is dataset imbalance, together with missing values, small data size, and very skewed data distributions. This paper presents an empirical study that used an Automated Machine Learning (AML) approach based on Genetic Programming (GP), named AML TPOT. This is a very recent AML tool, developed as an open-source Python library and reported as a promising model by the few researchers who have tested the algorithm. Nevertheless, most of the work on AML TPOT has been conducted on common or benchmark datasets for machine learning testing. In this paper, the focus is on a real and deviant dataset, which was collected on tax avoidance by Government-Linked Companies in Malaysia. A comparison of the AML performance on this dataset under different GP parameter settings is provided. The paper thus provides fundamental knowledge on the experimental design and findings that will be useful for future improvement of GP-based AML.
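
The abstract does not list the GP parameters or values that were compared, so the following is only a minimal sketch of how such a comparison could be set up. It assumes the classic TPOT 0.x `TPOTClassifier` API (`generations`, `population_size`, `mutation_rate`, `crossover_rate`). The helper names `gp_settings` and `evaluate`, the parameter values, and the choice of `balanced_accuracy` as the score for an imbalanced dataset are illustrative assumptions, not the authors' setup.

```python
from itertools import product


def gp_settings(generations=(5, 10), population_sizes=(20, 50),
                mutation_rates=(0.9, 0.5)):
    """Enumerate hypothetical GP settings (values are assumptions)."""
    settings = []
    for g, p, m in product(generations, population_sizes, mutation_rates):
        settings.append({
            "generations": g,
            "population_size": p,
            "mutation_rate": m,
            # TPOT requires mutation_rate + crossover_rate <= 1.
            "crossover_rate": round(1.0 - m, 10),
        })
    return settings


def evaluate(X, y, setting, seed=42):
    """Fit TPOT with one GP setting and return its held-out score."""
    # Imported lazily so the settings can be listed without TPOT installed.
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    # A stratified split keeps the class ratio of an imbalanced dataset.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    model = TPOTClassifier(scoring="balanced_accuracy", cv=5,
                           random_state=seed, verbosity=0, **setting)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)
```

Calling `evaluate(X, y, s)` for each `s` in `gp_settings()` would give one score per GP configuration, which could then be tabulated for comparison.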

Masrom, S., Rahman, R. A., Baharun, N., & Rahman, A. S. A. (2020). Automated Machine Learning with Genetic Programming on Real Dataset of Tax Avoidance Classification Problem. In ACM International Conference Proceeding Series (pp. 139–143). Association for Computing Machinery. https://doi.org/10.1145/3383923.3383942
