To help automate pre-processing, an important step in machine learning and data mining, we introduce synth-a-sizer, a tool that semi-automatically wrangles spreadsheets into the attribute-value format expected by popular machine learning tools. The user only needs to mark the cells that belong to a single example. synth-a-sizer is based on inductive programming principles. We introduce synth-a-sizer’s transformations and search algorithm, as well as a heuristic and a distance measure for identifying types. We also report on a first experimental evaluation.
CITATION STYLE
Verbruggen, G., & De Raedt, L. (2018). Automatically wrangling spreadsheets into machine learning data formats. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11191 LNCS, pp. 367–379). Springer Verlag. https://doi.org/10.1007/978-3-030-01768-2_30