Semi-automated augmentation of pandas dataframes

Abstract

Creative feature engineering is an important aspect of machine learning prediction tasks, and it can be facilitated by augmenting datasets with additional data to improve predictions. This paper presents a semi-automatic approach to augmenting existing datasets, represented as pandas dataframes, with data from open data sources. The approach has two aims: (1) to automatically suggest data augmentation options given an existing set of features, and (2) to automatically augment the data when the user selects a suggestion. The paper demonstrates the performance of the approach in aligning typical machine learning datasets with open data sources and in suggesting useful augmentation options, and it describes the design and implementation of a software tool that implements the approach, available as open-source software.
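
The two aims in the abstract can be illustrated with a minimal sketch. It is not the authors' tool or its API: the function names `suggest_augmentations` and `augment`, the value-overlap heuristic, the `min_overlap` threshold, and the in-memory `sources` dictionary standing in for open data sources are all assumptions made for illustration.

```python
import pandas as pd


def suggest_augmentations(df, sources, min_overlap=0.5):
    """Hypothetical helper: propose joins between df columns and source keys.

    A candidate is kept when at least `min_overlap` of the distinct values in
    a dataframe column also appear in a column of a source table.
    """
    suggestions = []
    for col in df.columns:
        values = set(df[col].dropna().unique())
        if not values:
            continue
        for name, table in sources.items():
            for key in table.columns:
                key_values = set(table[key].dropna().unique())
                overlap = len(values & key_values) / len(values)
                if overlap >= min_overlap:
                    suggestions.append(
                        {"column": col, "source": name, "key": key, "overlap": overlap}
                    )
    # Best-matching candidates first.
    return sorted(suggestions, key=lambda s: s["overlap"], reverse=True)


def augment(df, sources, suggestion):
    """Hypothetical helper: apply one selected suggestion as a left join."""
    table = sources[suggestion["source"]]
    # One row per key so the join cannot multiply rows of df.
    table = table.drop_duplicates(subset=suggestion["key"])
    # Align the key name with the dataframe column, then join on it.
    table = table.rename(columns={suggestion["key"]: suggestion["column"]})
    return df.merge(table, how="left", on=suggestion["column"], suffixes=("", "_aug"))


# Toy data: an existing dataset and one stand-in "open data" table.
df = pd.DataFrame({"country": ["Japan", "Thailand", "Peru"], "sales": [10, 20, 30]})
sources = {
    "country_info": pd.DataFrame(
        {"name": ["Japan", "Thailand", "France"], "region": ["Asia", "Asia", "Europe"]}
    )
}

suggestions = suggest_augmentations(df, sources)  # aim (1): suggest options
augmented = augment(df, sources, suggestions[0])  # aim (2): apply the selected one
print(augmented)
```

In this sketch the user's selection is simply the first suggestion. Rows without a match in the source table (here, "Peru") keep their original values and receive missing values in the added columns, because the join is a left join.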

Lynden, S., & Taveekarn, W. (2019). Semi-automated augmentation of pandas dataframes. In Communications in Computer and Information Science (Vol. 1071, pp. 70–79). Springer Verlag. https://doi.org/10.1007/978-981-32-9563-6_8
