A semi-supervised topic-driven approach for clustering textual answers to survey questions

Hui Yang; Ajay Mysore; Sharonda Wallace

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5678 LNAI 374-385

DOI: 10.1007/978-3-642-03348-3_36

Abstract

We propose an algorithm to effectively cluster a specific type of text documents: textual responses gathered through a survey system. Due to the peculiar features exhibited in such responses (e.g., short in length, rich in outliers, and diverse in categories), traditional unsupervised and semi-supervised clustering techniques are challenged to achieve satisfactory performance as demanded by a survey task. We address this issue by proposing a semi-supervised, topic-driven approach. It first employs an unsupervised algorithm to generate a preliminary clustering schema for all the answers to a question. A human expert then uses this schema to identify the major topics in these answers. Finally, a topic-driven clustering algorithm is adopted to obtain an improved clustering schema. We evaluated this approach using five questions in a survey we recently conducted in the U.S. The results demonstrate that this approach can lead to significant improvement in clustering quality. © 2009 Springer.
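
The three-stage workflow described in the abstract (unsupervised preliminary clustering, expert topic identification, topic-driven reclustering) can be illustrated with a minimal sketch. The abstract does not name the algorithms used, so everything below is an assumption made for illustration: the bag-of-words representation, the cosine similarity measure, the greedy single-pass clustering in step 1, the thresholds, and the function names `vectorize`, `cosine`, `preliminary_clusters`, and `topic_driven_clusters` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def vectorize(text):
    """Lowercased bag-of-words term counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def preliminary_clusters(answers, threshold=0.3):
    """Step 1 (assumed algorithm): greedy single-pass clustering.

    Each answer joins the most similar existing cluster if the similarity
    to that cluster's summed term vector reaches `threshold`; otherwise it
    starts a new cluster. Returns lists of answer indices.
    """
    clusters, centroids = [], []
    for i, answer in enumerate(answers):
        vec = vectorize(answer)
        best, best_sim = None, threshold
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])
            centroids.append(Counter(vec))
        else:
            clusters[best].append(i)
            centroids[best].update(vec)
    return clusters


def topic_driven_clusters(answers, topics, min_sim=0.1):
    """Step 3 (assumed algorithm): assign answers to expert-defined topics.

    `topics` maps a topic name to keywords chosen by the human expert in
    step 2 after inspecting the preliminary clusters. Answers that match
    no topic with similarity >= `min_sim` go to an "outlier" bucket.
    """
    topic_vecs = {name: vectorize(" ".join(words)) for name, words in topics.items()}
    result = {name: [] for name in topics}
    result["outlier"] = []
    for answer in answers:
        vec = vectorize(answer)
        name, sim = max(
            ((n, cosine(vec, tv)) for n, tv in topic_vecs.items()),
            key=lambda pair: pair[1],
        )
        result[name if sim >= min_sim else "outlier"].append(answer)
    return result
```

In this sketch the expert's role is reduced to supplying a keyword list per topic; the explicit outlier bucket reflects the abstract's remark that survey answers are rich in outliers, but how the paper itself handles them is not stated here.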

Cite

Yang, H., Mysore, A., & Wallace, S. (2009). A semi-supervised topic-driven approach for clustering textual answers to survey questions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5678 LNAI, pp. 374–385). https://doi.org/10.1007/978-3-642-03348-3_36
