Abstract
Claim prediction is an important task in insurance. As claim frequency rises, the volume of claim data grows to the scale of big data, so insurance companies need suitable machine learning methods to manage it efficiently. XGBoost is a machine learning model based on decision trees. It can be applied to claim prediction as a two-class or multi-class classification problem. When building an XGBoost model, we may select a subset of the features, especially for data with a large number of features. In this paper, we examine the influence of the proportion of features on the accuracy of the XGBoost model. Our simulations show that, using a randomly selected 1/5 of the features, the XGBoost model achieves accuracy comparable to that of the model using all features. This suggests that the XGBoost model is scalable in terms of the proportion of features.
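As a rough illustration of what random feature subsampling at a proportion of 1/5 means, the following minimal sketch draws a random subset of column indices using only the Python standard library. The function name `sample_feature_subset`, the feature count of 50, and the seed are hypothetical and do not come from the paper. In the XGBoost library, a proportion like this is usually set through a column subsampling parameter such as `colsample_bytree`; whether the authors used that particular parameter is an assumption.

```python
import random


def sample_feature_subset(n_features, proportion, seed=None):
    """Return sorted indices of a random feature subset (hypothetical helper)."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    # Always keep at least one feature, even for very small proportions.
    k = max(1, round(n_features * proportion))
    rng = random.Random(seed)
    # Sample k distinct column indices without replacement.
    return sorted(rng.sample(range(n_features), k))


# Assumed example: a data set with 50 features, keeping 1/5 of them.
subset = sample_feature_subset(n_features=50, proportion=0.2, seed=0)
print(len(subset))  # 10 features selected

# Assumption: the same proportion expressed as an XGBoost-style parameter dict.
params = {"colsample_bytree": 0.2}
```

A model trained only on the selected columns can then be compared against one trained on all columns, which is the kind of comparison the abstract describes.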
Cite
Khusna, W., & Murfí, H. (2020). An analysis of the proportion of feature subsampling on XG boost - A case study of claim prediction in car insurance. In AIP Conference Proceedings (Vol. 2296). American Institute of Physics Inc. https://doi.org/10.1063/5.0031366