While automatic term extraction is a well-researched area, computational approaches for distinguishing between degrees of technicality are still understudied. We semi-automatically create a German gold standard of technicality across four domains, and illustrate the impact of a web-crawled general-language corpus on predicting technicality. When defining a classification approach that combines general-language and domain-specific word embeddings, we go beyond previous work and align the vector spaces to obtain comparative embeddings. We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. Both models outperform previous approaches, with the multi-channel model performing best.
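To make the idea of aligning a general-language and a domain-specific vector space more concrete, the following is a minimal sketch. It assumes orthogonal Procrustes as the alignment technique and uses the element-wise difference and cosine similarity as the comparison; the abstract names neither choice, so both are illustrative assumptions. The function names `procrustes_align` and `comparative_features` are hypothetical and not taken from the paper.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix W minimising ||source @ W - target||_F.

    Both inputs are (n_words, dim) arrays whose rows refer to the same
    vocabulary in the same order. Orthogonal Procrustes is an assumption
    here, not a method stated in the abstract.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def comparative_features(general, domain):
    """Align the domain space to the general space and compare word by word.

    Returns the aligned domain vectors, the per-word difference vectors,
    and the per-word cosine similarity (illustrative comparison features).
    """
    w = procrustes_align(domain, general)
    aligned = domain @ w
    diff = general - aligned
    norms = np.linalg.norm(general, axis=1) * np.linalg.norm(aligned, axis=1)
    cosine = np.sum(general * aligned, axis=1) / norms
    return aligned, diff, cosine
```

Under these assumptions, a word whose domain-specific vector stays close to its general-language vector after alignment would yield a cosine near 1, while a word used very differently in the domain would yield a lower value. Features of this kind could serve as the pre-computed comparative input to a classifier.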
CITATION STYLE
Hätty, A., Schlechtweg, D., Dorna, M., & Schulte im Walde, S. (2020). Predicting degrees of technicality in automatic terminology extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 2883–2889). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.258