Balanced class distributions are important for accurate prediction. In some important cases, however, the data distribution is severely unbalanced. In this study, several algorithms are combined to maximize learning accuracy on a highly unbalanced dataset of about 3 million cases. Fuzzy c-means (FCM), a linguistic algorithm for evaluating the relationship between input and output, is applied to cluster the majority class so that each cluster balances the minority class data. Each cluster is then used to train several artificial neural network (ANN) models. Two techniques for selecting among these models are applied to generate an ensemble genetic fuzzy neuro model (EGFNM). The first, the intra-cluster EGFNM, evaluates the best combination of all the models generated by each cluster. The second, the inter-cluster EGFNM, selects the best model from each cluster. Accuracy is evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC). Results show that training on the unbalanced data gives an AUC of 0.67974. The random cluster and the best single ANN model have AUCs of 0.7177 and 0.72806, respectively. For the ensembles, the intra-cluster and inter-cluster EGFNMs produce AUCs of 0.7293 and 0.73038, respectively. In conclusion, the EGFNM method improves on training with the unbalanced data, and selecting several of the best models produces a better result than combining all models.
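The evaluation idea in the abstract, scoring an ensemble by its AUC and comparing a selected subset of models against all models combined, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`auc`, `average_scores`, `best_subset`), the choice to combine models by averaging their scores, and the toy scores and labels, none of which come from the study. An exhaustive subset search stands in for the genetic search, which is only practical for a handful of models.

```python
from itertools import combinations


def auc(scores, labels):
    """ROC AUC as the probability that a positive case outranks a
    negative one; tied scores count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_scores(model_scores):
    """Combine models by averaging their scores case by case
    (an assumed combination rule, not taken from the study)."""
    return [sum(col) / len(col) for col in zip(*model_scores)]


def best_subset(model_scores, labels):
    """Try every non-empty subset of models and return (AUC, indices)
    of the best averaged ensemble. Exhaustive stand-in for a genetic
    search; the cost grows as 2**n in the number of models."""
    best = (-1.0, ())
    for k in range(1, len(model_scores) + 1):
        for idx in combinations(range(len(model_scores)), k):
            subset = [model_scores[i] for i in idx]
            score = auc(average_scores(subset), labels)
            if score > best[0]:
                best = (score, idx)
    return best


# Hypothetical toy data: four cases, three models (scores are arbitrary).
labels = [1, 1, 0, 0]
models = [
    [9, 4, 5, 1],  # model 0
    [3, 8, 2, 6],  # model 1
    [2, 3, 9, 7],  # model 2, ranks the cases badly
]

all_combined = auc(average_scores(models), labels)
selected_auc, selected_models = best_subset(models, labels)
print("all models combined:", all_combined)
print("best subset:", selected_models, "AUC:", selected_auc)
```

On this toy data the search keeps models 0 and 1 and drops the poorly ranking model 2, so the selected ensemble scores higher than the average of all three. That mirrors the abstract's conclusion only qualitatively; the numbers carry no meaning beyond the example.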
Sadrawi, M., Sun, W. Z., Ma, M. H. M., Yeh, Y. T., Abbod, M. F., & Shieh, J. S. (2018). Ensemble genetic fuzzy neuro model applied for the emergency medical service via unbalanced data evaluation. Symmetry, 10(3). https://doi.org/10.3390/SYM10030071