TourismNLG: A Multi-lingual Generative Benchmark for the Tourism Domain

Abstract

The tourism industry is important both for the benefits it brings and for its role as a commercial activity that creates demand and growth in many other industries. Yet there is little work on data science problems in tourism, and there is not even a standard benchmark for evaluating tourism-specific data science tasks and models. In this paper, we propose TourismNLG, a benchmark of five natural language generation (NLG) tasks for the tourism domain, and release the corresponding datasets with standard train, validation and test splits. Further, previously proposed data science solutions for tourism problems do not leverage the recent benefits of transfer learning. Hence, we also contribute the first rigorously pretrained mT5 and mBART model checkpoints for the tourism domain. The models have been pretrained on four tourism-specific datasets covering different aspects of tourism. Using these models, we present initial baseline results on the benchmark tasks. We hope that the dataset will promote active research on natural language generation for travel and tourism. (https://drive.google.com/file/d/1tux19cLoXc1gz9Jwj9VebXmoRvF9MF6B/.)

Cite

Bhatt, S. M., Agarwal, S., Gurjar, O., Gupta, M., & Shrivastava, M. (2023). TourismNLG: A Multi-lingual Generative Benchmark for the Tourism Domain. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13980 LNCS, pp. 150–166). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-28244-7_10
