The importance of data collection, processing, and analysis is growing rapidly. Big Data technologies are in high demand in many areas, including bioinformatics, hydrometeorology, and high-energy physics. One of the most popular computation paradigms used in large-scale data processing frameworks is the MapReduce programming model. The integrated optimization mechanisms available today, which take into account only load balance and fast, simple execution, are not sufficient for advanced computations; more efficient and more comprehensive approaches are needed. In this paper, we propose an improved categorization-based algorithm for data reorganization in MapReduce frameworks that takes replication and network aspects into account. In addition, for urgent computations, which require a specific approach, we introduce customizable prioritization.
Spivak, A., Razumovskiy, A., Myagkov, A., & Nasonov, D. (2015). Evolutionary replicative data reorganization with prioritization for efficient workload processing. In Procedia Computer Science (Vol. 51, pp. 2357–2366). Elsevier B.V. https://doi.org/10.1016/j.procs.2015.05.405