Pre-trained Language Model for Web-scale Retrieval in Baidu Search

Abstract

Retrieval is a crucial stage in web search that identifies a small set of query-relevant candidates from a billion-scale corpus. Discovering more semantically related candidates at the retrieval stage is a promising way to expose more high-quality results to end users. However, building and deploying effective retrieval models for semantic matching in a real search engine remains a non-trivial challenge. In this paper, we describe the retrieval system that we developed and deployed in Baidu Search. The system exploits a recent state-of-the-art Chinese pretrained language model, Enhanced Representation through kNowledge IntEgration (ERNIE), which equips the system with expressive semantic matching. In particular, we developed an ERNIE-based retrieval model equipped with 1) expressive Transformer-based semantic encoders and 2) a comprehensive multi-stage training paradigm. More importantly, we present a practical system workflow for deploying the model in web-scale retrieval. The system is fully deployed in production, where rigorous offline and online experiments were conducted. The results show that the system can perform high-quality candidate retrieval, especially for tail queries with uncommon demands. Overall, the new retrieval system, facilitated by a pretrained language model (i.e., ERNIE), can substantially improve the usability and applicability of our search engine.
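
To make the idea of embedding-based candidate retrieval concrete, here is a minimal sketch in Python. It is not the authors' system: the abstract does not give the model architecture or the deployment workflow in detail, so everything below is an illustrative assumption. In particular, `toy_encode` is a hypothetical stand-in (a hashed bag of tokens) for a learned semantic encoder such as ERNIE, and `retrieve` scores every document exhaustively, which would not be practical for a billion-scale corpus.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 64  # embedding size for this toy example (an assumption)


def toy_encode(text: str, dim: int = DIM) -> List[float]:
    """Hypothetical stand-in for a learned semantic encoder.

    Hashes each whitespace token into one of `dim` buckets and
    L2-normalizes the result, so a dot product equals cosine similarity.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query: str, doc_vectors: List[List[float]], k: int) -> List[Tuple[int, float]]:
    """Return the k (doc_index, score) pairs with the highest similarity."""
    q = toy_encode(query)
    scored = [(i, dot(q, d)) for i, d in enumerate(doc_vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Document vectors can be computed once, ahead of query time.
docs = [
    "how to train a language model",
    "best noodle restaurants in beijing",
    "weather forecast for tomorrow",
]
doc_vectors = [toy_encode(d) for d in docs]

# At query time only the query is encoded, then compared to stored vectors.
for index, score in retrieve("noodle restaurants beijing", doc_vectors, k=2):
    print(f"{score:.3f}  {docs[index]}")
```

The design point this illustrates is the separation of concerns: documents are encoded independently of any query, so their vectors can be stored in advance, and retrieval reduces to a similarity search over those vectors.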

Cite

Liu, Y., Lu, W., Cheng, S., Shi, D., Wang, S., Cheng, Z., & Yin, D. (2021). Pre-trained Language Model for Web-scale Retrieval in Baidu Search. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 3365–3375). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467149
