TourismNLG: A Multi-lingual Generative Benchmark for the Tourism Domain

Sahil Manoj Bhatt; Sahaj Agarwal; Omkar Gurjar; Manish Gupta; Manish Shrivastava

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13980 LNCS, pp. 150-166 (2023)

DOI: 10.1007/978-3-031-28244-7_10

Abstract

The tourism industry is important both for the benefits it brings and for its role as a commercial activity that creates demand and growth in many other industries. Yet there is little work on data science problems in tourism, and there is not even a standard benchmark for evaluating tourism-specific data science tasks and models. In this paper, we propose a benchmark, TourismNLG, of five natural language generation (NLG) tasks for the tourism domain and release corresponding datasets with standard train, validation and test splits. Further, previously proposed data science solutions for tourism problems do not leverage the recent benefits of transfer learning. Hence, we also contribute the first rigorously pretrained mT5 and mBART model checkpoints for the tourism domain. The models have been pretrained on four tourism-specific datasets covering different aspects of tourism. Using these models, we present initial baseline results on the benchmark tasks. We hope that the dataset will promote active research on natural language generation for travel and tourism. (https://drive.google.com/file/d/1tux19cLoXc1gz9Jwj9VebXmoRvF9MF6B/)

Cite

Bhatt, S. M., Agarwal, S., Gurjar, O., Gupta, M., & Shrivastava, M. (2023). TourismNLG: A Multi-lingual Generative Benchmark for the Tourism Domain. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13980 LNCS, pp. 150–166). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-28244-7_10
