This paper critically examines current practices of benchmark dataset sharing in NLP and suggests a better way to inform reusers of benchmark datasets. Because a dataset sharing platform plays a key role not only in distributing a dataset but also in informing potential reusers about it, we believe data sharing platforms should provide comprehensive context for the datasets they host. We survey four benchmark dataset sharing platforms (HuggingFace, Papers with Code, TensorFlow, and PyTorch) to diagnose how datasets are currently shared, specifically which metadata is shared and which is omitted. Drawing on the concept of data curation, which considers future reuse at the time data is made public, we advance a direction that benchmark dataset sharing platforms should take into consideration. We find that the four platforms differ in their metadata practices and that there is no consensus on what social impact metadata is. We believe the missing discussion of social impact on dataset sharing platforms stems from a lack of agreement on who should be in charge of it. We propose that social impact metadata should be developed for benchmark datasets and that data curators should take a role in managing it.
CITATION STYLE
Park, J., & Jeoung, S. (2022). Raison d’être of the benchmark dataset: A Survey of Current Practices of Benchmark Dataset Sharing Platforms. In NLP-Power 2022 - 1st Workshop on Efficient Benchmarking in NLP, Proceedings of the Workshop (pp. 1–10). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.nlppower-1.1