A data structure to speed-up machine learning algorithms on massive datasets

9Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Data processing in a fast and efficient way is an important functionality in machine learning, especially with the growing interest in data storage. This exponential increment in data size has hampered traditional techniques for data analysis and data processing, giving rise to a new set of methodologies under the term Big Data. Many efficient algorithms for machine learning have been proposed, facing up time and main memory requirements. Nevertheless, this process could still become hard when the number of features or records is extremely high. In this paper, the goal is not to propose new efficient algorithms but a new data structure that could be used by a variety of existing algorithms without modifying their original schemata. Moreover, the proposed data structure enables sparse datasets to be massively reduced, efficiently processing the data input into a new data structure output. The results demonstrate that the proposed data structure is highly promising, reducing the amount of storage and improving query performance.

Cite

CITATION STYLE

APA

Padillo, F., Luna, J. M., Cano, A., & Ventura, S. (2016). A data structure to speed-up machine learning algorithms on massive datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9648, pp. 365–376). Springer Verlag. https://doi.org/10.1007/978-3-319-32034-2_31

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free