AutoName: A Corpus-Based Set Naming Framework

Zhiqi Huang; Razieh Rahimi; Puxuan Yu; Jingbo Shang; James Allan

Conference Proceedings, Open Access

SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (2021) 2102-2105

DOI: 10.1145/3404835.3463100

3 Citations

8 Readers

Abstract

We propose AutoName, an unsupervised framework that extracts a name for a set of query entities from a large-scale text corpus. Entity-set naming is useful in many natural language processing and information retrieval tasks, such as session-based and conversational information seeking. Previous studies mainly extract set names from knowledge bases, which provide highly reliable entity relations but suffer from limited entity coverage and from set names that represent broad semantic classes. To address these problems, AutoName generates hypernym-anchored candidate phrases by probing a pre-trained language model and the entities' contexts in documents. Phrases are then clustered to identify those that describe concepts common to the query entities. Finally, AutoName ranks the refined phrases based on the co-occurrence of their words with the query entities and the conceptual integrity of their respective clusters. We built a new benchmark dataset for this task, consisting of 130 entity sets with name labels. Experimental results show that AutoName generates coherent and meaningful set names and significantly outperforms all baselines.
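
The three stages named in the abstract (candidate generation, clustering, ranking) can be illustrated with a toy sketch. This is not the authors' implementation: a "such as" text pattern stands in for language-model probing, grouping by head word stands in for phrase clustering, and entity coverage stands in for the co-occurrence and cluster-integrity scores. The corpus, the function names, and the scoring are all hypothetical and chosen only to show the shape of the pipeline.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; the paper works with a large-scale text corpus.
CORPUS = [
    "European capital cities such as Paris and Berlin attract many tourists.",
    "Large cities such as Paris and Madrid have extensive metro systems.",
    "Capital cities such as Berlin and Madrid host national parliaments.",
    "Popular destinations such as Paris are crowded in summer.",
]


def candidate_phrases(corpus, entities):
    """Stand-in for candidate generation: take the phrase before 'such as'
    in sentences whose remainder mentions at least one query entity."""
    pattern = re.compile(r"^(.*?)\s+such as\s+(.*)$", re.IGNORECASE)
    candidates = []
    for sentence in corpus:
        match = pattern.match(sentence)
        if not match:
            continue
        phrase, rest = match.group(1).lower(), match.group(2)
        mentioned = {
            e for e in entities if re.search(rf"\b{re.escape(e)}\b", rest)
        }
        if mentioned:
            candidates.append((phrase, mentioned))
    return candidates


def cluster_by_head(candidates):
    """Stand-in for clustering: group phrases by their final (head) word."""
    clusters = defaultdict(list)
    for phrase, mentioned in candidates:
        clusters[phrase.split()[-1]].append((phrase, mentioned))
    return clusters


def rank_names(clusters, entities):
    """Assumed scoring: first the share of query entities the phrase's
    cluster covers, then how many entities the phrase itself co-occurs
    with; ties are broken alphabetically (an arbitrary choice)."""
    scored = []
    for members in clusters.values():
        covered = set().union(*(mentioned for _, mentioned in members))
        coverage = len(covered) / len(entities)
        for phrase, mentioned in members:
            scored.append((coverage, len(mentioned), phrase))
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [phrase for _, _, phrase in scored]


def name_set(corpus, entities):
    """Run the three toy stages end to end and return ranked candidate names."""
    candidates = candidate_phrases(corpus, entities)
    return rank_names(cluster_by_head(candidates), entities)


if __name__ == "__main__":
    print(name_set(CORPUS, ["Paris", "Berlin", "Madrid"]))
```

In this toy corpus, phrases ending in "cities" form a cluster that covers all three query entities, so they rank above "popular destinations", whose cluster covers only one. A real system would need far richer candidate generation and scoring than these stand-ins.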

Cite

Huang, Z., Rahimi, R., Yu, P., Shang, J., & Allan, J. (2021). AutoName: A Corpus-Based Set Naming Framework. In SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2102–2105). Association for Computing Machinery, Inc. https://doi.org/10.1145/3404835.3463100
