AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space

Chengrun Yang; Jicong Fan; Ziyang Wu; Madeleine Udell

Conference Proceedings | Open Access

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2020) 1446-1456

DOI: 10.1145/3394486.3403197

23 Citations

57 Readers

Abstract

Data scientists seeking a good supervised learning model on a dataset have many choices to make: they must preprocess the data, select features, possibly reduce the dimension, select an estimation algorithm, and choose hyperparameters for each of these pipeline components. With new pipeline components comes a combinatorial explosion in the number of choices. In this work, we design a new AutoML system, TensorOboe, to address this challenge: an automated system to design a supervised learning pipeline. TensorOboe uses low-rank tensor decomposition as a surrogate model for efficient pipeline search. We also develop a new greedy experiment design protocol to gather information about a new dataset efficiently. Experiments on large corpora of real-world classification problems demonstrate the effectiveness of our approach.
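
The abstract gives no implementation details, so the following is only a rough illustration of the two ideas it names: a low-rank surrogate model and a greedy choice of which pipelines to evaluate on a new dataset. It is a simplified sketch under several assumptions and is not the TensorOboe implementation. It uses a dataset-by-pipeline error matrix rather than a tensor, the greedy rule is a generic D-optimal heuristic standing in for the paper's protocol, the data are synthetic, and all function names (`fit_pipeline_factors`, `greedy_select`, `predict_row`) are hypothetical.

```python
import numpy as np

def fit_pipeline_factors(errors, rank):
    """Latent pipeline features from a truncated SVD of the error matrix.

    errors has shape (n_datasets, n_pipelines); the result has shape
    (rank, n_pipelines). This is a matrix stand-in for a tensor model.
    """
    _, s, vt = np.linalg.svd(errors, full_matrices=False)
    return s[:rank, None] * vt[:rank]

def greedy_select(factors, budget, ridge=1e-3):
    """Greedily pick pipelines that maximize the log-determinant of the
    information matrix (an assumed D-optimal design heuristic)."""
    rank, n_pipelines = factors.shape
    info = ridge * np.eye(rank)
    chosen = []
    for _ in range(budget):
        best, best_gain = None, -np.inf
        for j in range(n_pipelines):
            if j in chosen:
                continue
            y = factors[:, j]
            gain = np.linalg.slogdet(info + np.outer(y, y))[1]
            if gain > best_gain:
                best, best_gain = j, gain
        chosen.append(best)
        info += np.outer(factors[:, best], factors[:, best])
    return chosen

def predict_row(factors, observed_idx, observed_vals):
    """Estimate the new dataset's latent vector by least squares on the
    observed pipelines, then predict the error of every pipeline."""
    y_obs = factors[:, observed_idx]          # (rank, n_observed)
    x, *_ = np.linalg.lstsq(y_obs.T, observed_vals, rcond=None)
    return x @ factors

# Synthetic demo: 40 training datasets, 30 pipelines, exact rank 3.
rng = np.random.default_rng(0)
pipeline_basis = rng.normal(size=(3, 30))
errors = rng.normal(size=(40, 3)) @ pipeline_basis
new_row = rng.normal(size=3) @ pipeline_basis  # unseen dataset, same structure

factors = fit_pipeline_factors(errors, rank=3)
chosen = greedy_select(factors, budget=5)      # pipelines to actually run
predicted = predict_row(factors, chosen, new_row[chosen])
print("evaluated pipelines:", chosen)
print("max abs prediction error:", np.max(np.abs(predicted - new_row)))
```

In this toy setting only 5 of the 30 pipelines are evaluated on the new dataset, and the surrogate fills in the rest. Real error matrices are only approximately low rank, so predictions there would be approximate rather than exact.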

Cite

Yang, C., Fan, J., Wu, Z., & Udell, M. (2020). AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1446–1456). Association for Computing Machinery. https://doi.org/10.1145/3394486.3403197
