PMCoders at SemEval-2023 Task 1: RAltCLIP: Use Relative AltCLIP Features to Rank

Abstract

The Visual Word Sense Disambiguation (VWSD) task aims to find, among 10 candidate images, the one most related to an ambiguous word given limited textual context. In this work, we use AltCLIP features and a standard 3-layer transformer encoder to compare the given phrase with each candidate image by cosine similarity. We also improve our model’s generalization by using a subset of LAION-5B. The best official baseline achieves a macro-averaged hit rate of 37.20% and a macro-averaged MRR (Mean Reciprocal Rank) of 54.39%. Our best configuration reaches 39.61% and 56.78%, respectively. The code will be made publicly available on GitHub.
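
To make the ranking step and the two reported metrics concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function names (`cosine`, `rank_images`, `hit_rate_and_mrr`) are hypothetical, the embeddings are toy 2-dimensional vectors standing in for AltCLIP features, and the transformer encoder and the macro-averaging over evaluation splits are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_images(text_vec, image_vecs):
    """Return candidate image indices, most similar to the text first."""
    scores = [cosine(text_vec, img) for img in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: scores[i], reverse=True)


def hit_rate_and_mrr(rankings, gold):
    """Hit rate (gold image ranked first) and mean reciprocal rank.

    rankings: one ranked list of image indices per instance.
    gold: the index of the correct image for each instance.
    """
    hits = 0
    reciprocal_sum = 0.0
    for ranking, gold_index in zip(rankings, gold):
        position = ranking.index(gold_index) + 1  # 1-based rank of the gold image
        if position == 1:
            hits += 1
        reciprocal_sum += 1.0 / position
    n = len(gold)
    return hits / n, reciprocal_sum / n


# Toy data (assumed for illustration): two phrases, three candidate images each.
images = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
rankings = [rank_images([1.0, 0.0], images), rank_images([0.0, 1.0], images)]
hit_rate, mrr = hit_rate_and_mrr(rankings, gold=[1, 2])
```

In the actual task each instance has 10 candidate images, so a ranking would have 10 entries instead of 3; the metric definitions are unchanged.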

Cite

Pirhadi, M. J., Mirzaei, M., Mohammadi, M. R., & Eetemadi, S. (2023). PMCoders at SemEval-2023 Task 1: RAltCLIP: Use Relative AltCLIP Features to Rank. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 1751–1755). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.242