Research on the Construction and Realization of Data Pipeline in Machine Learning Regression Prediction

This article is free to access.

Abstract

Data sets used for machine learning often contain missing values and text-type data, and sometimes attributes in the data set need to be combined. The data set must therefore be cleaned and transformed before a machine learning model can be trained. This usually involves a chain of processing steps, and the entire procedure is time-consuming and inconvenient. This article examines the data pipeline and recommends using it to handle all data processing. We automate the processing steps and use k-fold cross-validation to evaluate model performance. Experiments demonstrate that this approach can lower the root mean square error of the regression prediction model and improve prediction accuracy.
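
The kind of pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn and pandas, and the synthetic data, the column names (`rooms`, `households`, `region`), the helper `add_ratio`, and the choice of linear regression are all hypothetical. It chains missing-value imputation, attribute combination, and text encoding into one object, then scores it with 5-fold cross-validation on root mean square error.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

# Synthetic data (hypothetical columns) with a text attribute.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "rooms": rng.uniform(2, 10, n),
    "households": rng.uniform(1, 5, n),
    "region": rng.choice(["north", "south", "east"], n),
})
target = 3.0 * data["rooms"] / data["households"] + rng.normal(0, 0.1, n)
# Inject missing values into one numeric column.
data.loc[rng.choice(n, 20, replace=False), "rooms"] = np.nan


def add_ratio(X):
    """Append a combined attribute: column 0 divided by column 1."""
    return np.c_[X, X[:, 0] / X[:, 1]]


# Numeric branch: impute, combine attributes, then scale.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("combine", FunctionTransformer(add_ratio)),
    ("scale", StandardScaler()),
])

# Route numeric and text columns to their own transformers.
preprocess = ColumnTransformer([
    ("num", numeric_steps, ["rooms", "households"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])

# Full pipeline: preprocessing followed by the regression model.
model = Pipeline([("preprocess", preprocess), ("regress", LinearRegression())])

# k-fold cross-validation; scikit-learn reports RMSE as a negated score.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    model, data, target, scoring="neg_root_mean_squared_error", cv=cv
)
rmse_per_fold = -scores
print("mean RMSE over folds:", rmse_per_fold.mean())
```

Because the preprocessing lives inside the pipeline, each cross-validation fold fits the imputer, scaler, and encoder on its own training split only, which keeps information from the held-out split out of the fitted transformers.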

Citation (APA)

Zhang, H., Zheng, G., Xu, J., & Yao, X. (2022). Research on the Construction and Realization of Data Pipeline in Machine Learning Regression Prediction. Mathematical Problems in Engineering, 2022. https://doi.org/10.1155/2022/7924335