Harmonizing Sequential and Random Access to Datasets in Organizationally Distributed Environments

Michał Wrzeszcz; Łukasz Opioła; Bartosz Kryza; Łukasz Dutka; Renata G. Słota; Jacek Kitowski

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11536 LNCS 295-308

DOI: 10.1007/978-3-030-22734-0_22

Abstract

Computational science is developing rapidly, pushing the boundaries of data management with respect to the size and structure of datasets, data processing patterns, geographical distribution of data, and performance expectations. In this paper we present a solution for harmonizing data access performance, i.e., finding a compromise between local and remote read/write efficiency that fits those evolving requirements. It is based on variable-size logical data chunks (in contrast to fixed-size blocks), direct storage access, and several mechanisms that improve remote data access performance. The solution is implemented in the Onedata system and suited to its multi-layer architecture, which supports organizationally distributed environments with limited trust between data providers. The solution is benchmarked against XRootD + XCache, which offers similar functionality. The results show that the performance of the two systems is comparable, although overheads in local data access are visibly lower in Onedata.
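To make the contrast between variable-size logical chunks and fixed-size blocks more concrete, the following is a minimal illustrative sketch. It is not taken from the paper and is not Onedata's implementation; the class `ChunkMap` and its methods `add` and `missing` are hypothetical names introduced here. It assumes that locally available parts of a file are tracked as merged byte ranges, so that a read request can be split into the parts already held locally and the gaps that would have to be fetched remotely.

```python
class ChunkMap:
    """Hypothetical tracker of locally available byte ranges of one file.

    Ranges are half-open [start, end) and kept sorted, non-overlapping,
    and merged when they touch, so one long sequential read is stored as
    a single variable-size chunk rather than many fixed-size blocks.
    """

    def __init__(self):
        self.chunks = []  # list of (start, end) tuples

    def add(self, start, end):
        """Record that bytes [start, end) are now available locally."""
        merged = []
        for s, e in self.chunks:
            if e < start or s > end:
                # Disjoint and not adjacent: keep the chunk unchanged.
                merged.append((s, e))
            else:
                # Overlapping or touching: absorb into the new range.
                start, end = min(s, start), max(e, end)
        merged.append((start, end))
        merged.sort()
        self.chunks = merged

    def missing(self, start, end):
        """Return the sub-ranges of [start, end) not available locally."""
        gaps = []
        pos = start
        for s, e in self.chunks:
            if e <= pos:
                continue  # chunk lies entirely before the current position
            if s >= end:
                break     # chunk lies entirely after the request
            if s > pos:
                gaps.append((pos, s))
            pos = max(pos, e)
        if pos < end:
            gaps.append((pos, end))
        return gaps


cm = ChunkMap()
cm.add(0, 100)        # sequential read, first part
cm.add(100, 250)      # sequential read, continues; merges into (0, 250)
cm.add(1000, 1010)    # small random read far into the file
print(cm.chunks)                # [(0, 250), (1000, 1010)]
print(cm.missing(200, 1005))    # [(250, 1000)]
```

In this sketch a sequential read collapses into one metadata entry regardless of its length, while a small random read costs one small entry; a fixed-size block scheme would instead need one entry per block touched and would round the small read up to a whole block.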

Citation (APA)

Wrzeszcz, M., Opioła, Ł., Kryza, B., Dutka, Ł., Słota, R. G., & Kitowski, J. (2019). Harmonizing Sequential and Random Access to Datasets in Organizationally Distributed Environments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11536 LNCS, pp. 295–308). Springer Verlag. https://doi.org/10.1007/978-3-030-22734-0_22
