We present our contribution to SemEval-2022 Shared Task 8: Multilingual News Article Similarity. The approach is lightweight and language-agnostic: it computes several lexicographic and embedding-based features and feeds them to a simple machine learning model, a random forest. In a notable departure from the task formulation, which is a ranking task, we treated the task as a classification problem. We present a detailed analysis of the behaviour of our system under different settings.
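To make the kind of pipeline described above more concrete, the following is a minimal sketch and not the system's actual implementation. The specific features (token Jaccard overlap, character trigram overlap, embedding cosine), every function name, and the rounding of a graded score into a class label are illustrative assumptions; the abstract does not specify them. The embeddings are assumed to be precomputed vectors from some multilingual encoder.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Multiset of character n-grams (assumed feature; works without a tokenizer)."""
    s = " ".join(text.lower().split())
    return Counter(s[i:i + n] for i in range(max(len(s) - n + 1, 0)))


def jaccard(a, b):
    """Jaccard overlap of two collections, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(text_a, text_b, emb_a, emb_b):
    """Hypothetical feature vector for one article pair."""
    return [
        jaccard(text_a.lower().split(), text_b.lower().split()),  # lexical, token level
        jaccard(char_ngrams(text_a), char_ngrams(text_b)),        # lexical, character level
        cosine(emb_a, emb_b),                                     # embedding based
    ]


def score_to_class(score):
    """Assumed discretization: round a graded score half-up to an integer class."""
    return int(math.floor(score + 0.5))
```

Feature vectors of this shape, paired with the discretized labels, could then be passed to an off-the-shelf random forest classifier such as scikit-learn's `RandomForestClassifier`; that choice of library is also an assumption.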
CITATION STYLE
Stefanovitch, N. (2022). Team TMA at SemEval-2022 Task 8: Lightweight and Language-Agnostic News Similarity Classifier. In SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop (pp. 1178–1183). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.semeval-1.166