Categories for (Big) Data models and optimization

Laurent Thiry; Heng Zhao; Michel Hassenforder

Journal Article · Open Access

Journal of Big Data (2018) 5(1)

DOI: 10.1186/s40537-018-0132-9

10 Citations

29 Readers

Abstract

This paper proposes a theoretical foundation for Big Data. More precisely, it explains how “functors”, a concept from Category Theory, can model the various data structures commonly used to represent (large) data sets, and how “natural transformations” can formalize relations between these structures. Algorithms, such as querying a specific piece of information, depend mainly on the data structure considered, so natural transformations can serve to optimize these algorithms and return a result in less time. The paper details four functors modeling tabular data, graph structures (e.g. triple stores), cached data, and split data. Next, the paper explains how, by considering a functional programming language, the concepts can be implemented without effort to propose new tools (e.g. efficient information servers and query languages). As a complement to the mathematical models proposed, the paper also presents an optimized data server and a specific query language (based on “unification” to facilitate the search for information). Finally, the paper gives a comparison study and shows that this tool is more efficient than most of the standard tools available on the market: the functional server appears to be 10+ times faster than relational or document-oriented databases (MySQL and MongoDB), and 100+ times faster than a graph database (Neo4j).
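The central idea of the abstract, that a natural transformation between two data-structure functors can turn a slow query into a fast one, can be sketched in a few lines. The sketch below is illustrative only and is not the authors' implementation (the paper works in a functional programming language; Python is used here for brevity). All names (`fmap_table`, `fmap_index`, `to_index`, `lookup_table`, `lookup_index`) and the sample data are assumptions introduced for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# Assumed "tabular" functor: a table is a list of (key, value) rows.
Table = List[Tuple[str, A]]
# Assumed "indexed" functor: the same rows held in a hash map.
Index = Dict[str, A]


def fmap_table(f: Callable[[A], B], table: Table) -> Table:
    """Functor action on tables: apply f to every value, keep the keys."""
    return [(k, f(v)) for k, v in table]


def fmap_index(f: Callable[[A], B], index: Index) -> Index:
    """Functor action on indexes: apply f to every value, keep the keys."""
    return {k: f(v) for k, v in index.items()}


def to_index(table: Table) -> Index:
    """Candidate natural transformation Table -> Index (keys assumed unique)."""
    return dict(table)


def lookup_table(key: str, table: Table) -> Optional[A]:
    """Query on the tabular structure: linear scan over the rows."""
    for k, v in table:
        if k == key:
            return v
    return None


def lookup_index(key: str, index: Index) -> Optional[A]:
    """Same query on the indexed structure: a single hash lookup."""
    return index.get(key)


# Hypothetical sample data.
people: Table = [("ann", 31), ("bob", 45), ("eve", 27)]


def add_one(age: int) -> int:
    return age + 1


# Naturality: transforming then mapping equals mapping then transforming.
assert to_index(fmap_table(add_one, people)) == fmap_index(add_one, to_index(people))

# The query gives the same answer on both structures.
assert lookup_table("bob", people) == lookup_index("bob", to_index(people))
```

The naturality check is what justifies the optimization in this sketch: because `to_index` commutes with mapping over values, a query written against the table can be answered from the index without changing its result, while the lookup cost drops from a scan of all rows to a hash access.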

Cite

Thiry, L., Zhao, H., & Hassenforder, M. (2018). Categories for (Big) Data models and optimization. Journal of Big Data, 5(1). https://doi.org/10.1186/s40537-018-0132-9
