Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

Abstract

Mining a set of meaningful topics organized into a hierarchy is intuitively appealing, since topic correlations are ubiquitous in massive text corpora. To account for potential hierarchical topic structures, hierarchical topic models generalize flat topic models by incorporating latent topic hierarchies into their generative modeling process. However, because these models are purely unsupervised, the learned topic hierarchy often deviates from users' particular needs or interests. To guide the hierarchical topic discovery process with minimal user supervision, we propose a new task, Hierarchical Topic Mining, which takes a category tree described by category names only and aims to mine a set of representative terms for each category from a text corpus, helping users comprehend their topics of interest. We develop a novel joint tree and text embedding method, along with a principled optimization procedure, that simultaneously models the category tree structure and the corpus generative process in the spherical space for effective discovery of category-representative terms. Our comprehensive experiments show that our model, named JoSH, mines a high-quality set of hierarchical topics with high efficiency and benefits weakly-supervised hierarchical text classification tasks.
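To make the idea of retrieving category-representative terms in the spherical space concrete, the following is a minimal sketch and not the JoSH method itself: it assumes word and category-name embeddings are already available, projects them onto the unit sphere, and ranks words for each category by cosine similarity. The joint training procedure and the tree structure described in the abstract are omitted. The function names (normalize, cosine, representative_terms) and the toy 3-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def normalize(vec):
    """Project a vector onto the unit sphere (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(u, v):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(a * b for a, b in zip(u, v))


def representative_terms(category_vecs, word_vecs, top_k=2):
    """Rank words for each category by cosine similarity on the unit sphere.

    Illustrative assumption: embeddings are given. Nothing is learned here.
    """
    cats = {name: normalize(v) for name, v in category_vecs.items()}
    words = {w: normalize(v) for w, v in word_vecs.items()}
    result = {}
    for name, c in cats.items():
        ranked = sorted(words, key=lambda w: cosine(c, words[w]), reverse=True)
        result[name] = ranked[:top_k]
    return result


# Toy, hand-made vectors (hypothetical; not learned from any corpus).
category_vecs = {"sports": [1.0, 0.1, 0.0], "science": [0.0, 0.2, 1.0]}
word_vecs = {
    "soccer": [0.9, 0.2, 0.1],
    "tennis": [0.8, 0.0, 0.2],
    "physics": [0.1, 0.1, 0.9],
    "biology": [0.0, 0.4, 0.8],
}

topics = representative_terms(category_vecs, word_vecs, top_k=2)
print(topics)
```

Because every vector is normalized first, only direction matters and vector length plays no role in the ranking, which is the property a spherical embedding space relies on.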

Cite

Meng, Y., Zhang, Y., Huang, J., Zheng, Y., Zhang, C., & Han, J. (2020). Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1908–1917). Association for Computing Machinery. https://doi.org/10.1145/3394486.3403242
