Stephen Colbert at SemEval-2023 Task 5: Using Markup for Classifying Clickbait

Abstract

For SemEval-2023 Task 5, we submitted three DeBERTaV3LARGE models to tackle the first subtask, classifying the spoiler types (passage, phrase, multi) of clickbait web articles. Basic parameters such as sequence length were chosen with BERTBASE uncased; further approaches were then tested with DeBERTaV3BASE, and only the most promising ones were moved to DeBERTaV3LARGE. Our research showed that the placement of information on webpages is often optimized with regard to, e.g., ad placement. This information is usually described within the webpage's markup, which is why we pursued an approach that takes it into account. Overall, we did not manage to beat the baseline, which we attribute to three reasons: First, we only crawled markup for Huffington Post articles, extracting only - and -tags, which do not cover enough aspects of a webpage's design. Second, Huffington Post articles are overrepresented in the given dataset, which, third, shows an imbalance among the spoiler tags. We strongly suggest re-annotating the given dataset so that markup-optimized models like MarkupLM or TIE can be used, and removing embedded articles like "Yahoo" and archives like "archive.is" or "web.archive" to avoid noise. The imbalance should also be tackled by adding articles from sources other than Huffington Post, keeping in mind that multi-tagged entries should likewise be balanced against passage- and phrase-tagged ones.

Cite

Spreitzer, S., & Tran, H. N. (2023). Stephen Colbert at SemEval-2023 Task 5: Using Markup for Classifying Clickbait. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 1844–1848). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.254
