Abstract
Writing job vacancies can be a repetitive and expensive task for humans. This research focuses on automatically generating parts of vacancy texts, specifically the benefits section, from structured job attributes using mT5, the multilingual version of the state-of-the-art T5 transformer model. While transformers are accurate at generating coherent text, they can struggle to correctly include structured (input) data in the generated text. Including this input data correctly is crucial for vacancy text generation; otherwise, job seekers may be misled. To evaluate how the model includes the different types of structured input, we propose a novel domain-specific metric: 'input generation accuracy'. Our metric addresses the shortcomings of Relation Generation, a commonly used evaluation metric for data-to-text generation that relies on string matching, since our task requires evaluating generated texts against binary and categorical inputs. Using our novel evaluation method, we measure how well the input is included in the generated text separately for each type of input (binary, categorical, numeric), offering another contribution to the field. In addition, we evaluate how accurately the mT5 model generates texts in the requested languages. Our experiments show that mT5 is highly accurate at generating texts in the correct (requested) language and at correctly handling seen categorical and binary inputs. However, it performs worse when generating text from unseen city names or when working with numeric inputs.
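The abstract does not give the formal definition of 'input generation accuracy'; the following is only a minimal sketch of how such a per-type inclusion check could look, assuming a hypothetical lookup table of acceptable surface forms for binary and categorical fields and verbatim matching for numeric fields. Field names, data shapes, and matching rules here are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Dict, List

# Illustrative sketch only: the paper defines the real metric in the full text.
# Assumed data shapes:
#   examples:      [{"inputs": {field: value, ...}, "generated": "text"}, ...]
#   surface_forms: {field: {value: ["acceptable phrase", ...]}}  (hypothetical)
#   field_types:   {field: "binary" | "categorical" | "numeric"}

def input_generation_accuracy(
    examples: List[Dict],
    surface_forms: Dict[str, Dict],
    field_types: Dict[str, str],
) -> Dict[str, float]:
    """Fraction of structured inputs correctly realised in the text, per input type."""
    correct = {"binary": 0, "categorical": 0, "numeric": 0}
    total = {"binary": 0, "categorical": 0, "numeric": 0}

    for example in examples:
        text = example["generated"].lower()
        for field, value in example["inputs"].items():
            ftype = field_types[field]
            total[ftype] += 1
            if ftype == "numeric":
                # Numeric inputs (e.g. a salary figure) must appear verbatim.
                if str(value) in text:
                    correct[ftype] += 1
            else:
                # Binary/categorical inputs count as included if any of their
                # acceptable surface forms occurs in the generated text.
                phrases = surface_forms.get(field, {}).get(value, [str(value)])
                if any(phrase.lower() in text for phrase in phrases):
                    correct[ftype] += 1

    return {t: correct[t] / total[t] for t in total if total[t] > 0}
```

Reporting the score separately per input type, as above, is what distinguishes this kind of check from a single string-matching score such as Relation Generation.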
Citation
Lorincz, A., Lavi, D., Graus, D., & Pereira, J. L. M. (2022). Transfer learning for multilingual vacancy text generation. In GEM 2022 - 2nd Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings of the Workshop (pp. 207–222). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.gem-1.18