Abstract
Multiword Terms (MWTs) are domain-specific Multiword Expressions (MWEs) (Pajić et al., 2018) in which two or more lexemes converge to form a new unit of meaning (León Araúz and Cabezas García, 2020). Processing MWTs is crucial in many Natural Language Processing (NLP) applications, including Machine Translation (MT) and terminology extraction. However, detecting these terms automatically remains difficult, and more research is needed to produce more insightful and useful results. In this study, we seek to fill this gap using state-of-the-art transformer models. We evaluate both BERT-like (Devlin et al., 2019) discriminative transformer models and generative pre-trained transformer (GPT) (Radford et al., 2018) models on this task, and we show that discriminative models outperform current GPT models at identifying multiword flower and plant names in both English and Spanish. The best discriminative models achieve F1 scores of 94.3 and 82.1 on the English and Spanish data, respectively, whereas ChatGPT reaches only 63.3 and 47.7.
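For readers unfamiliar with how an F1 score is obtained for a term identification task, the following is a minimal sketch in Python. It assumes the task is framed as BIO sequence tagging and scored by exact span match; the abstract does not state the evaluation protocol, so this framing, the function names `extract_spans` and `span_f1`, and the example sentence are all illustrative assumptions rather than the authors' method.

```python
def extract_spans(tags):
    """Return the set of (start, end) spans encoded by a BIO tag sequence.

    `end` is exclusive. An "I" with no preceding "B" is tolerated and
    treated as the start of a span (an assumption of this sketch).
    """
    spans = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.add((start, i))
            start = i
        elif tag == "I":
            if start is None:
                start = i
        else:  # any other tag is treated as "O"
            if start is not None:
                spans.add((start, i))
                start = None
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def span_f1(gold_tags, pred_tags):
    """Exact-match span-level F1 between gold and predicted BIO tags."""
    gold_spans = extract_spans(gold_tags)
    pred_spans = extract_spans(pred_tags)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: "The lily of the valley blooms near a red rose"
gold = ["O", "B", "I", "I", "I", "O", "O", "O", "B", "I"]
pred = ["O", "B", "I", "I", "I", "O", "O", "O", "O", "B"]
print(span_f1(gold, pred))
```

In this example one of the two gold spans is recovered exactly and one of the two predicted spans is correct, so precision and recall are both 0.5. Reported scores such as 94.3 correspond to this kind of value multiplied by 100.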
Cite
Premasiri, D., Haddad, A. H., Ranasinghe, T., & Mitkov, R. (2023). Deep Learning Methods for Identification of Multiword Flower and Plant Names. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 879–887). Incoma Ltd. https://doi.org/10.26615/978-954-452-092-2_095