Abstract
Information retrieval (IR) aims to find information in a corpus that meets users’ needs. Different needs correspond to different IR tasks, such as document retrieval, open-domain question answering, and retrieval-based dialogue, yet all of these tasks share the same schema: estimating the relationship between texts. This suggests that a good IR model should be able to generalize across tasks and domains. However, previous studies indicate that state-of-the-art neural information retrieval (NIR) models, e.g., pre-trained language models (PLMs), generalize poorly. The main reason is that the end-to-end fine-tuning paradigm makes the model overemphasize task-specific signals and domain biases and lose the ability to capture generalized essential signals. To address this problem, we propose NIR-Prompt, a novel NIR training framework for the retrieval and reranking stages based on the idea of decoupling signal capturing from signal combination. NIR-Prompt uses an Essential Matching Module (EMM) to capture the essential matching signals and a Matching Description Module (MDM) to obtain a description of each task. The description serves as task-adaptation information for combining the essential matching signals so that the model adapts to different tasks. Experiments under in-domain multi-task, out-of-domain multi-task, and new-task adaptation settings show that NIR-Prompt improves the generalization of PLMs in NIR for both the retrieval and reranking stages compared with baselines.
Cite
Xu, S., Pang, L., Shen, H., & Cheng, X. (2023). NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework. ACM Transactions on Information Systems, 42(2). https://doi.org/10.1145/3626092