Abstract
The reliability of hard drives is paramount for maintaining data integrity and availability in cloud services and enterprise-level data centers, where unexpected failures significantly degrade operational efficiency and overall performance. This work aims to develop a predictive model that uses regression analysis to forecast imminent hard drive failures from historical operational data, specifically SMART (Self-Monitoring, Analysis and Reporting Technology) attributes. The study evaluated five regression models: Decision Tree, Random Forest, Support Vector Machine (SVM), Gradient Boosting, and Neural Network. The results indicate that the Random Forest model, with an MSE of 24.7427 and an R² of 0.9876, and the Neural Network model, with an MSE of 22.6011 and an R² of 0.7442, performed best, demonstrating high predictive accuracy and robustness. In contrast, the SVM model performed poorly, with an MSE of 2888.8623 and a negative R² of -0.4465. Based on these results, the Random Forest and Neural Network models are recommended for predicting hard drive failures, as they offer a balance of accuracy and interpretability.
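The two metrics reported above can be illustrated with a minimal sketch. It assumes the standard definitions of mean squared error and the coefficient of determination; the function names and the toy values are hypothetical and are not taken from the study's data or code. It also shows why R² can be negative, as reported for the SVM: this happens whenever a model's squared error exceeds that of always predicting the mean of the targets.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy targets and predictions (illustrative only).
y_true = [10.0, 20.0, 30.0, 40.0]
good_pred = [12.0, 18.0, 31.0, 39.0]  # close to the targets
bad_pred = [40.0, 10.0, 45.0, 5.0]    # worse than predicting the mean

print(mse(y_true, good_pred), r2(y_true, good_pred))
print(mse(y_true, bad_pred), r2(y_true, bad_pred))
```

Because MSE depends on the scale of the target while R² is normalized by the target's variance, the two metrics need not rank models identically.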
Cite
Atekoja, E. (2024). Prediction of Hard Drive Failure using Machine Learning. Global Journal of Computer Science and Technology, 41–54. https://doi.org/10.34257/gjcstdvol24is1pg41