Categories for (Big) Data models and optimization

10Citations
Citations of this article
29Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

This paper proposes a theoretical foundation for Big Data. More precisely, it explains how “functors”, a concept coming from Category Theory, can serve to model the various data structures commonly used to represent (large) data sets, and how “natural transformations” can formalize relations between these structures. Algorithms, such as querying a precise information, mainly depend on the data structure considered, and thus natural transformations can serve to optimize these algorithms and get a result in a shorter time. The paper details four functors modeling tabular data, graph structures (e.g. triple stores), cached and split data. Next, the paper explains how, by considering a functional programming language, the concepts can be implemented without effort to propose new tools (e.g. efficient information servers and query languages). And, as a complement to the mathematical models proposed, the paper also presents a optimized data server and a specific query language (based on “unification” to facilitates the search of information). Finally, the paper gives a comparison study and shows that this tool is more efficient than most of the standards available in the market: the functional server appears to be 10+ times faster than relational or document oriented databases (Mysql and MongoDB), and 100+ times faster than a graph database (Neo4j).

Cite

CITATION STYLE

APA

Thiry, L., Zhao, H., & Hassenforder, M. (2018). Categories for (Big) Data models and optimization. Journal of Big Data, 5(1). https://doi.org/10.1186/s40537-018-0132-9

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free