Research on the Construction and Realization of Data Pipeline in Machine Learning Regression Prediction

Hua Zhang; Guoxun Zheng; Jun Xu; Xuekun Yao

Journal Article, Open Access

Mathematical Problems in Engineering, vol. 2022 (2022)

DOI: 10.1155/2022/7924335

8 Citations

17 Readers

Abstract

Data sets used in machine learning often contain missing values and text-type data, and it is sometimes necessary to combine attributes in the data set. The data must therefore be cleaned and transformed before a machine learning model can be trained. This usually involves a chain of processing steps, and carrying them out one by one is time-consuming and inconvenient. This article examines the data pipeline and recommends using it for all of the data processing. We automate the processing with a pipeline and use k-fold cross-validation to evaluate the performance of the model. Experiments demonstrate that this approach lowers the root mean square error of the regression prediction model and improves prediction accuracy.
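The workflow summarized in the abstract can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation. The class and function names (`MedianImputer`, `RatioFeature`, `OneHotEncoder`, `RidgeRegression`, `kfold_rmse`), the column names, and the toy values are all assumptions made up for this example, and a small ridge regression stands in for whatever regression model the article actually uses. Each step is fitted on the training fold only and then applied to the held-out fold, and the score reported for each fold is the root mean square error, RMSE = sqrt(mean((prediction - target)^2)).

```python
import math
import statistics

class MedianImputer:
    """Replace None in one numeric column with the median learned in fit()."""
    def __init__(self, key):
        self.key = key
    def fit(self, rows):
        self.median = statistics.median(
            r[self.key] for r in rows if r[self.key] is not None)
        return self
    def transform(self, rows):
        return [{**r, self.key: self.median if r[self.key] is None else r[self.key]}
                for r in rows]

class RatioFeature:
    """Combine two attributes into a new one (numerator / denominator)."""
    def __init__(self, num, den, name):
        self.num, self.den, self.name = num, den, name
    def fit(self, rows):
        return self
    def transform(self, rows):
        return [{**r, self.name: r[self.num] / r[self.den]} for r in rows]

class OneHotEncoder:
    """Turn a text column into 0/1 columns, one per category seen in fit()."""
    def __init__(self, key):
        self.key = key
    def fit(self, rows):
        self.categories = sorted({r[self.key] for r in rows})
        return self
    def transform(self, rows):
        out = []
        for r in rows:
            new = {k: v for k, v in r.items() if k != self.key}
            for c in self.categories:
                new[f"{self.key}={c}"] = 1.0 if r[self.key] == c else 0.0
            out.append(new)
        return out

class RidgeRegression:
    """Stand-in linear model: solves (X'X + alpha*I) w = X'y.
    For brevity the small penalty also applies to the intercept."""
    def __init__(self, alpha=1e-3):
        self.alpha = alpha
    def _vec(self, row):
        return [1.0] + [row[c] for c in self.columns]
    def fit(self, rows, y):
        self.columns = sorted(rows[0])
        X = [self._vec(r) for r in rows]
        n = len(X[0])
        # Augmented matrix [X'X + alpha*I | X'y].
        A = [[sum(x[i] * x[j] for x in X) + (self.alpha if i == j else 0.0)
              for j in range(n)] + [sum(x[i] * t for x, t in zip(X, y))]
             for i in range(n)]
        for i in range(n):  # Gauss-Jordan elimination with partial pivoting
            p = max(range(i, n), key=lambda k: abs(A[k][i]))
            A[i], A[p] = A[p], A[i]
            for k in range(n):
                if k != i:
                    f = A[k][i] / A[i][i]
                    A[k] = [a - f * b for a, b in zip(A[k], A[i])]
        self.w = [A[i][n] / A[i][i] for i in range(n)]
        return self
    def predict(self, rows):
        return [sum(w * v for w, v in zip(self.w, self._vec(r))) for r in rows]

def fit_predict(steps, model, train, y_train, valid):
    """Fit every step on the training fold only, then apply it to both folds."""
    for step in steps:
        step.fit(train)
        train, valid = step.transform(train), step.transform(valid)
    return model.fit(train, y_train).predict(valid)

def kfold_rmse(steps, model, rows, y, k=3):
    """Return one RMSE per fold; fold i holds out rows i, i+k, i+2k, ..."""
    scores = []
    for i in range(k):
        held = set(range(i, len(rows), k))
        train = [r for j, r in enumerate(rows) if j not in held]
        y_train = [t for j, t in enumerate(y) if j not in held]
        valid = [r for j, r in enumerate(rows) if j in held]
        y_valid = [t for j, t in enumerate(y) if j in held]
        pred = fit_predict(steps, model, train, y_train, valid)
        squared = [(p - t) ** 2 for p, t in zip(pred, y_valid)]
        scores.append(math.sqrt(statistics.fmean(squared)))
    return scores

raw = [  # (rooms, households, region, price): made-up toy values
    (6.0, 2.0, "inland", 150.0), (8.0, 2.0, "coast", 260.0),
    (None, 3.0, "inland", 140.0), (9.0, 3.0, "coast", 250.0),
    (5.0, 1.0, "inland", 170.0), (12.0, 4.0, "coast", 245.0),
    (7.0, 2.0, "inland", 155.0), (None, 2.0, "coast", 255.0),
    (10.0, 5.0, "inland", 135.0),
]
rows = [{"rooms": r, "households": h, "region": g} for r, h, g, _ in raw]
y = [price for *_, price in raw]
steps = [MedianImputer("rooms"),
         RatioFeature("rooms", "households", "rooms_per_household"),
         OneHotEncoder("region")]
scores = kfold_rmse(steps, RidgeRegression(), rows, y, k=3)
print("RMSE per fold:", scores, "mean:", statistics.fmean(scores))
```

Bundling the steps into one list means the same cleaning and conversion runs identically in every fold. Because each step is refitted per fold, values learned during cleaning (the median, the category list) never come from the held-out rows.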

Citation (APA)

Zhang, H., Zheng, G., Xu, J., & Yao, X. (2022). Research on the Construction and Realization of Data Pipeline in Machine Learning Regression Prediction. Mathematical Problems in Engineering, 2022. https://doi.org/10.1155/2022/7924335