SpaDE: Improving Sparse Representations using a Dual Document Encoder for First-stage Retrieval

Abstract

Sparse document representations have been widely used to retrieve relevant documents via exact lexical matching. Owing to the pre-computed inverted index, they support fast ad-hoc search but suffer from the vocabulary mismatch problem. Although recent neural ranking models using pre-trained language models can address this problem, they usually incur expensive query inference costs, implying a trade-off between effectiveness and efficiency. To tackle this trade-off, we propose a novel uni-encoder ranking model, Sparse retriever using a Dual document Encoder (SpaDE), which learns document representations via a dual encoder. The two encoders play central roles in, respectively, (i) adjusting the importance of terms to improve lexical matching and (ii) expanding additional terms to support semantic matching. Furthermore, our co-training strategy trains the dual encoder effectively and avoids unnecessary interference between the two encoders during training. Experimental results on several benchmarks show that SpaDE outperforms existing uni-encoder ranking models.
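
The retrieval setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two "encoders" are replaced by hand-written term-to-weight maps, and the function names (`combine`, `build_inverted_index`, `search`) and the mixing parameter `alpha` are hypothetical. The abstract does not state how the two encoder outputs are aggregated, so the weighted sum here is an assumption. The sketch shows only the general idea: document-side term weights and expanded terms are merged offline into one sparse vector, stored in an inverted index, and matched against raw query terms with no query encoder.

```python
from collections import defaultdict


def combine(weighting, expansion, alpha=0.5):
    """Merge two sparse term->weight maps into one document vector.

    The weighted sum with `alpha` is an illustrative assumption,
    not the aggregation used in the paper.
    """
    doc_vec = defaultdict(float)
    for term, w in weighting.items():
        doc_vec[term] += alpha * w
    for term, w in expansion.items():
        doc_vec[term] += (1 - alpha) * w
    return dict(doc_vec)


def build_inverted_index(doc_vectors):
    """Map each term to a posting list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        for term, w in vec.items():
            if w > 0:
                index[term].append((doc_id, w))
    return index


def search(index, query, top_k=10):
    """Score documents by summing stored weights of the query's terms.

    The query is only lowercased and split on whitespace; no model runs
    at query time, which is what keeps query inference cheap.
    """
    scores = defaultdict(float)
    for term in set(query.lower().split()):
        for doc_id, w in index.get(term, []):
            scores[doc_id] += w
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Toy stand-ins for the two encoder outputs (hand-written, hypothetical).
term_weights = {
    "d1": {"car": 2.0, "engine": 1.0},
    "d2": {"bicycle": 2.0, "repair": 1.0},
}
expanded_terms = {
    "d1": {"automobile": 1.0, "vehicle": 0.5},
    "d2": {"bike": 1.0},
}

doc_vectors = {
    doc_id: combine(term_weights[doc_id], expanded_terms[doc_id])
    for doc_id in term_weights
}
index = build_inverted_index(doc_vectors)
results = search(index, "automobile engine")
```

In this toy example the query term "automobile" never occurs in the original text of `d1`, yet `d1` is still retrieved because the expansion map added it at indexing time. That is the vocabulary mismatch problem being addressed on the document side.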

Citation (APA)

Choi, E., Lee, S., Choi, M., Ko, H., Song, Y. I., & Lee, J. (2022). SpaDE: Improving Sparse Representations using a Dual Document Encoder for First-stage Retrieval. In International Conference on Information and Knowledge Management, Proceedings (pp. 272–282). Association for Computing Machinery. https://doi.org/10.1145/3511808.3557456
