Solving Hungarian natural language processing tasks with multilingual generative models


Abstract

Generative ability is a crucial requirement for artificial intelligence applications such as chatbots, virtual assistants, and machine translation systems. In recent years, transformer-based neural architectures have given a huge boost to generating human-like English text. In our research, we conducted experiments to create pre-trained generative transformer models for the Hungarian language and fine-tune them for multiple types of natural language processing tasks, focusing on multilingual models. We pre-trained a multilingual BART model and then fine-tuned it for various NLP tasks, such as text classification and abstractive summarization. In our experiments, we applied transfer learning techniques to increase performance. Furthermore, an M2M100 multilingual model was fine-tuned for a 12-lingual Hungarian-centric machine translation task. Last but not least, a Marian NMT-based machine translation system was also built from scratch for the same 12-lingual Hungarian-centric machine translation task. Our results show that the cross-lingual transfer method achieved higher performance in all of our tasks. In our machine translation experiment, our fine-tuned M2M100 model outperformed Google Translate, Microsoft Translator, and eTranslation.

Citation (APA)

Yang, Z. G., & Laki, L. J. (2023). Solving Hungarian natural language processing tasks with multilingual generative models. Annales Mathematicae et Informaticae, 57, 92–106. https://doi.org/10.33039/ami.2022.11.001
