Ascertaining the impact of balancing the flood dataset on the performance of classification based flood forecasting models for the northern districts of Bihar

Abstract

Bihar is the most flood-affected state in India, and the losses it incurs amount to one-third of the country's total flood losses. These losses can be alleviated by designing models that forecast floods in real time. One existing model uses classification-based machine learning techniques to forecast floods in the northern districts of Bihar. However, the flood dataset it used was imbalanced: non-flooding instances far exceeded flooding instances. This paper addresses the problem by balancing the data with oversampling techniques and then using the balanced data to design flood forecasting models. The objective is to ascertain whether balancing the dataset improves the performance of the classifiers. An experimental comparison showed that the classifiers performed better on the balanced dataset in terms of accuracy, precision, recall, F-measure and AUC-ROC. Further, the dataset balanced using KMeans SMOTE yielded the largest improvement in the performance of all classifiers.
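
To illustrate why class imbalance matters here, the following is a minimal sketch in plain Python. It is not the authors' code or data: the feature names (rainfall, river level), the toy values, and the helper functions `smote_like_oversample` and `precision_recall_f1` are all hypothetical. It uses basic SMOTE-style interpolation between minority samples, not the KMeans SMOTE variant the abstract names, which additionally clusters the data before oversampling.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-measure for the positive (flood) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy data: (rainfall_mm, river_level_m). 95 non-flood vs 5 flood rows.
no_flood = [(float(r), 2.0 + 0.01 * r) for r in range(95)]
flood = [(180.0, 6.1), (200.0, 6.5), (220.0, 7.0), (210.0, 6.8), (190.0, 6.3)]

# A classifier that always predicts "no flood" scores high accuracy
# on the imbalanced data while never detecting a single flood.
y_true = [0] * len(no_flood) + [1] * len(flood)
y_pred = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

# Oversample the flood class until both classes have the same size.
synthetic = smote_like_oversample(flood, n_new=len(no_flood) - len(flood))
balanced_flood = flood + synthetic

print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"flood rows after balancing: {len(balanced_flood)} vs no-flood: {len(no_flood)}")
```

The always-negative baseline shows why accuracy alone can mislead on imbalanced data and why precision, recall and F-measure are reported alongside it. In practice, oversampling would normally be applied to the training split only, so that synthetic points do not leak into evaluation.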

Cite

Mittal, V., Kumar, T. V. V., & Goel, A. (2022). Ascertaining the impact of balancing the flood dataset on the performance of classification based flood forecasting models for the northern districts of Bihar. International Journal of Water, 15(2), 75–100. https://doi.org/10.1504/IJW.2022.132287
