Constrained co-clustering for textual documents

Yangqiu Song; Shimei Pan; Shixia Liu; Furu Wei; Michelle X. Zhou; Weihong Qian

Conference Proceedings (Open Access)

Proceedings of the National Conference on Artificial Intelligence (2010), Vol. 1, pp. 581–586

DOI: 10.1609/aaai.v24i1.7680

Citations: N/A

Readers: 40

Abstract

In this paper, we present a constrained co-clustering approach for clustering textual documents. Our approach combines the benefits of information-theoretic co-clustering and constrained clustering. We use a two-sided hidden Markov random field (HMRF) to model both the document and word constraints. We also develop an alternating expectation maximization (EM) algorithm to optimize the constrained co-clustering model. We have conducted two sets of experiments on a benchmark data set: (1) using human-provided category labels to derive document and word constraints for semi-supervised document clustering, and (2) using automatically extracted named entities to derive document constraints for unsupervised document clustering. Compared to several representative constrained clustering and co-clustering approaches, our approach is shown to be more effective for high-dimensional, sparse text data. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
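The two ways of deriving constraints mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that documents sharing a category label yield a must-link pair and documents with different labels yield a cannot-link pair, and that two documents sharing enough named entities yield a must-link pair. The function names and the `min_shared` threshold are hypothetical and do not come from the paper.

```python
from itertools import combinations


def constraints_from_labels(labels):
    """Derive pairwise constraints from known labels (assumed scheme).

    labels: dict mapping an item id (document or word) to its category label.
    Returns (must_link, cannot_link) as sets of ordered id pairs.
    """
    must_link, cannot_link = set(), set()
    for a, b in combinations(sorted(labels), 2):
        if labels[a] == labels[b]:
            must_link.add((a, b))      # same label: should share a cluster
        else:
            cannot_link.add((a, b))    # different labels: should be separated
    return must_link, cannot_link


def must_links_from_entities(entities, min_shared=2):
    """Derive must-link pairs from extracted named entities (assumed scheme).

    entities: dict mapping a document id to the set of its named entities.
    min_shared: hypothetical threshold on the number of shared entities.
    """
    must_link = set()
    for a, b in combinations(sorted(entities), 2):
        if len(entities[a] & entities[b]) >= min_shared:
            must_link.add((a, b))
    return must_link
```

In a constrained clustering model of this kind, such pairs would typically enter the objective as penalty terms for violated constraints. How the paper weights them within the two-sided HMRF is not stated in the abstract.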

Cite

Citation style: APA

Song, Y., Pan, S., Liu, S., Wei, F., Zhou, M. X., & Qian, W. (2010). Constrained co-clustering for textual documents. In Proceedings of the National Conference on Artificial Intelligence (Vol. 1, pp. 581–586). AI Access Foundation. https://doi.org/10.1609/aaai.v24i1.7680

Readers' Seniority

PhD / Post grad / Masters / Doc: 25 (76%)

Professor / Associate Prof.: 5 (15%)

Researcher: 3

Readers' Discipline

Computer Science: 24 (73%)

Engineering: 6 (18%)

Mathematics: 2

Chemistry: 1
