Lavoisier: High-Level Selection and Preparation of Data for Analysis

Abstract

Most data mining algorithms require their input data in a very specific tabular format. Data scientists typically produce this format by writing long and complex scripts in data management languages and libraries such as SQL, R or Pandas, in which they chain low-level data transformation operations. Writing these scripts is time-consuming and error-prone, which reduces data scientists’ productivity. To overcome this limitation, we present Lavoisier, a declarative language for data extraction and formatting. The language provides a set of high-level constructs that let data scientists abstract away from low-level data formatting operations. Consequently, data extraction scripts become smaller and simpler, which increases productivity compared with conventional data manipulation tools.
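
To make the problem concrete, the following is a minimal sketch of the kind of low-level script the abstract refers to, written with pandas. It is not taken from the paper and does not show Lavoisier syntax. The tables `customers` and `orders`, their columns, and the helper `one_row_per_customer` are hypothetical, chosen only to illustrate how several manual steps (aggregate, pivot, fill, join) are needed to obtain one row per entity.

```python
import pandas as pd

# Hypothetical input data; names and values are illustrative only.
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "country": ["ES", "FR"],
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "category": ["books", "music", "books"],
    "amount": [20.0, 5.0, 12.5],
})


def one_row_per_customer(customers: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    """Flatten customers and their orders into a single table, one row per customer."""
    # Step 1: total amount per customer and category.
    totals = orders.groupby(["customer_id", "category"], as_index=False)["amount"].sum()
    # Step 2: turn each category into its own column.
    wide = totals.pivot(index="customer_id", columns="category", values="amount")
    # Step 3: categories a customer never bought become 0, and columns get a prefix.
    wide = wide.fillna(0.0).add_prefix("amount_").reset_index()
    wide.columns.name = None
    # Step 4: attach the customer attributes.
    return customers.merge(wide, on="customer_id", how="left")


table = one_row_per_customer(customers, orders)
print(table)
```

Even in this small case, four separate operations are needed, and each one is a place where a wrong key or a forgotten fill value can silently corrupt the resulting table. This is the class of script the abstract proposes to replace with higher-level constructs.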

Citation (APA)

de la Vega, A., García-Saiz, D., Zorrilla, M., & Sánchez, P. (2019). Lavoisier: High-Level Selection and Preparation of Data for Analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11815 LNCS, pp. 50–66). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-32065-2_4
