Disease diagnosis is a demanding task because misclassifications carry a high cost. Classifiers for disease diagnosis must be accurate, offer a tradeoff between sensitivity and specificity, and address the problem of class imbalance. Further, ad hoc hyperparameter settings degrade classifier performance, particularly in disease diagnosis. Disease diagnosis systems therefore need to address hyperparameter setting and class imbalance simultaneously. The objective of this paper is to propose a multi-objective hyperparameter tuning approach, called MOHPT, used in combination with resampling techniques. So far, the hyperparameters of classifiers for disease diagnosis have often been optimized on a single objective criterion, namely accuracy. Multi-objective hyperparameter tuning in combination with resampling techniques is expected to enhance the efficacy of disease diagnosis. We have used decision trees (CART) and random forest as the classifiers. The proposed system, MOHPT, obtains non-dominated optimal hyperparameter configurations of these classifiers using the crowding-based sorting technique of the multi-objective algorithm NSGA-II. MOHPT uses sensitivity (true positive rate) and specificity (true negative rate) as the two objective functions for optimization, and each configuration in the non-dominated front of hyperparameter configurations corresponds to a different tradeoff between sensitivity and specificity. The suggested approach lets medical practitioners choose any of these hyperparameter configurations. In this work, however, we have selected the hyperparameter configuration with the best g-mean, the geometric mean (square root of the product) of sensitivity and specificity. Resampling techniques, namely undersampling, oversampling, SMOTE, RWO, MWSMOTE, and ROSE, have been implemented to tackle the class imbalance. MOHPT is tested on 17 medical datasets.
Several combinations of decision tree and random forest classifiers with hyperparameter tuning and resampling techniques have been implemented. The results have been validated using appropriate statistical tests. They show that applying hyperparameter tuning in combination with resampling techniques significantly enhances the performance of disease diagnosis. Overall, the random forest outperforms the other classifiers. A comparison of MOHPT with related works further supports its efficacy. The suggested technique succeeds in finding a set of non-dominated hyperparameter configurations while also addressing the class imbalance problem. The results show improvement on several performance evaluation metrics, such as sensitivity, specificity, and AUC.
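The selection step described in the abstract can be illustrated with a minimal sketch. Candidate hyperparameter configurations are scored on sensitivity and specificity, the non-dominated ones are kept, and one is picked by g-mean, taken here as the geometric mean of the two. This is not the authors' implementation. It omits the NSGA-II search and crowding-distance sorting and shows only the dominance filter and the final choice. The function names, the `max_depth` labels, and the scores are hypothetical values chosen for illustration and are not results from the paper.

```python
from math import sqrt


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(scores):
    """Keep the configurations whose scores no other configuration dominates."""
    return {
        name: s
        for name, s in scores.items()
        if not any(dominates(other, s) for o, other in scores.items() if o != name)
    }


def g_mean(sens, spec):
    """Geometric mean of sensitivity and specificity."""
    return sqrt(sens * spec)


# Hypothetical (sensitivity, specificity) scores, one per candidate configuration.
candidates = {
    "max_depth=3": (0.90, 0.60),
    "max_depth=5": (0.80, 0.80),
    "max_depth=8": (0.70, 0.85),
    "max_depth=12": (0.65, 0.80),  # dominated by max_depth=5
    "max_depth=20": (0.60, 0.90),
}

front = non_dominated(candidates)
best = max(front, key=lambda name: g_mean(*front[name]))
```

Every entry in `front` is a valid tradeoff that a practitioner could choose. Picking the maximum g-mean favors configurations that balance the two rates over those that excel at only one.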
CITATION STYLE
Kumar, S., & Ratnoo, S. (2021). Multi-objective hyperparameter tuning of classifiers for disease diagnosis. Indian Journal of Computer Science and Engineering, 12(5), 1334–1352. https://doi.org/10.21817/INDJCSE/2021/V12I5/211205081