Multiview Identifiers Enhanced Generative Retrieval

Abstract

Instead of simply matching a query to preexisting passages, generative retrieval generates identifier strings of passages as the retrieval target. This comes at a cost: the identifier must be distinctive enough to represent a passage. Current approaches use either a numeric ID or a text piece (such as a title or substrings) as the identifier. However, these identifiers cannot cover a passage's content well. We are therefore motivated to propose a new type of identifier, the synthetic identifier, which is generated from the content of a passage and can integrate contextualized information that text pieces lack. Furthermore, we simultaneously consider multiview identifiers, including synthetic identifiers, titles, and substrings. These views complement each other and facilitate the holistic ranking of passages from multiple perspectives. We conduct a series of experiments on three public datasets, and the results indicate that our proposed approach performs best among generative retrieval methods, demonstrating its effectiveness and robustness. The code is released at https://github.com/liyongqi67/MINDER.
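
The idea of ranking passages from several identifier views can be illustrated with a minimal sketch. It is not the scoring used in the paper or in the MINDER repository; the abstract does not specify one. The sketch assumes that a generative model has already produced scored identifier predictions for each view, and that a passage's rank score is simply the sum of the scores of the predicted identifiers found in that passage's view. The toy corpus, the function name `rank_passages`, and all scores are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy corpus: each passage exposes three identifier views.
PASSAGES = {
    "p1": {
        "title": "eiffel tower",
        "substring": "the eiffel tower is a wrought-iron lattice tower in paris",
        "synthetic": "where is the eiffel tower located",
    },
    "p2": {
        "title": "statue of liberty",
        "substring": "the statue of liberty stands on liberty island in new york",
        "synthetic": "who gifted the statue of liberty",
    },
}

# Hypothetical model output: (view, predicted identifier, score).
PREDICTIONS = [
    ("title", "eiffel tower", 2.0),
    ("substring", "lattice tower in paris", 1.5),
    ("synthetic", "where is the eiffel tower", 1.0),
    ("substring", "liberty island", 0.5),
]


def rank_passages(predictions, passages):
    """Rank passages by summing the scores of matching identifiers.

    Assumption for illustration: a predicted identifier matches a passage
    when it occurs as a substring of that passage's text for the same view.
    """
    scores = defaultdict(float)
    for view, identifier, score in predictions:
        for passage_id, views in passages.items():
            if identifier in views[view]:
                scores[passage_id] += score
    # Highest aggregate score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_passages(PREDICTIONS, PASSAGES)
print(ranking)
```

In this toy setup, a passage supported by all three views outranks one matched by a single substring, which is the complementary effect the abstract describes.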

Cite

Li, Y., Yang, N., Wang, L., Wei, F., & Li, W. (2023). Multiview Identifiers Enhanced Generative Retrieval. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 6636–6648). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.366
