Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges

Georgios L. Stavrinides; Helen D. Karatza

Book Chapter

Springer Science and Business Media Deutschland GmbH, (2018), 19-43

DOI: 10.1007/978-3-319-73767-6_2

22 Citations

10 Readers

Abstract

With the explosive growth of big data, workloads tend to get more complex and computationally demanding. Such applications are processed on distributed interconnected resources that are becoming larger in scale and computational capacity. Data-intensive applications may have different degrees of parallelism and must effectively exploit data locality. Furthermore, they may impose several Quality of Service requirements, such as time constraints and resilience against failures, as well as other objectives, like energy efficiency. These features of the workloads, as well as the inherent characteristics of the computing resources required to process them, present major challenges that require the employment of effective scheduling techniques. In this chapter, a classification of data-intensive workloads is proposed and an overview of the most commonly used approaches for their scheduling in large-scale distributed systems is given. We present novel strategies that have been proposed in the literature and shed light on open challenges and future directions.

Citation (APA)

Stavrinides, G. L., & Karatza, H. D. (2018). Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges. In Studies in Big Data (Vol. 36, pp. 19–43). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-73767-6_2
