The use of stopwords has been thoroughly studied in traditional Information Retrieval systems, but remains unexplored in the context of neural models. Neural re-ranking models take the full text of both the query and the document into account. Naturally, removing tokens that carry no relevance information offers an opportunity to improve effectiveness by reducing noise, and to lower the storage requirements for cached document representations. In this work we propose a novel contextualized stopword detection mechanism for neural re-ranking models. The mechanism trains a sparse gating vector that filters document tokens out of the ranking decision. This vector is learned end-to-end from the contextualized document representations, allowing the model to filter terms on a per-occurrence basis. Because it reduces noise, this also leads to a more explainable model. We integrate our component into the state-of-the-art interaction-based TK neural re-ranking model. Our experiments on the MS MARCO passage collection and queries from the TREC 2019 Deep Learning Track show that filtering out traditional stopwords prior to the neural model reduces its effectiveness, while learning to filter contextualized representations improves it.
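To make the mechanism concrete, here is a minimal PyTorch sketch of the idea the abstract describes: a small learned scorer assigns each contextualized document token a gate value, tokens scoring below a threshold are zeroed out of the downstream matching, and a sparsity penalty is trained jointly with the ranking loss. The module name `ContextualizedStopwordGate`, the ReLU-threshold gate, and the exact regularizer are illustrative assumptions, not the paper's verified architecture.

```python
import torch
import torch.nn as nn

class ContextualizedStopwordGate(nn.Module):
    """Sketch of a learned per-occurrence stopword filter (assumed design)."""

    def __init__(self, dim: int, threshold: float = 0.1):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)  # per-token relevance score
        self.threshold = threshold       # gate values below this become exactly zero

    def forward(self, doc_repr: torch.Tensor, doc_mask: torch.Tensor):
        # doc_repr: (batch, doc_len, dim) contextualized token vectors
        # doc_mask: (batch, doc_len), 1.0 for real tokens, 0.0 for padding
        scores = self.scorer(doc_repr).squeeze(-1)              # (batch, doc_len)
        gate = torch.relu(scores - self.threshold) * doc_mask   # hard zero below threshold
        gated = doc_repr * gate.unsqueeze(-1)                   # filtered representations
        sparsity_loss = gate.sum() / doc_mask.sum()             # L1-style pressure toward sparsity
        return gated, sparsity_loss
```

In such a setup, `gated` would replace the raw document representations fed into the re-ranker's interaction matching (e.g., TK's kernel pooling), and `sparsity_loss` would be added, suitably weighted, to the ranking loss so that removing tokens is learned end-to-end rather than dictated by a fixed stopword list.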
Hofstätter, S., Lipani, A., Zlabinger, M., & Hanbury, A. (2020). Learning to Re-Rank with Contextualized Stopwords. In International Conference on Information and Knowledge Management, Proceedings (pp. 2057–2060). Association for Computing Machinery. https://doi.org/10.1145/3340531.3412079