Interval semi-supervised LDA: Classifying needles in a haystack

Svetlana Bodrunova; Sergei Koltsov; Olessia Koltsova; Sergey Nikolenko; Anastasia Shimorina

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8265 LNAI(PART 1) 265-274

DOI: 10.1007/978-3-642-45114-0_21

25 Citations

10 Readers

Abstract

An important text mining problem is to find, in a large collection of texts, documents related to specific topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the purpose is to find the most representative documents for subsequent qualitative interpretation. To solve this problem, we propose an interval semi-supervised LDA approach, in which certain predefined sets of keywords (that define the topics researchers are interested in) are restricted to specific intervals of topic assignments. We present a case study on a Russian LiveJournal dataset aimed at ethnicity discourse analysis. © Springer-Verlag 2013.
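
The abstract does not spell out the sampling details, so the following is only a minimal sketch of one way to read "keywords restricted to specific intervals of topic assignments". It assumes each keyword may be assigned only to topics inside an inclusive index interval, while all other words may take any topic. The names `build_topic_mask`, `sample_topic`, the example vocabulary, and the intervals are hypothetical illustrations, not the authors' implementation.

```python
import random


def build_topic_mask(vocab, keyword_intervals, n_topics):
    """Return {word: list of topic indices the word may be assigned to}.

    keyword_intervals maps a keyword to an inclusive (lo, hi) interval of
    topic indices (an assumption of this sketch). Words without an entry
    are unrestricted.
    """
    mask = {}
    for word in vocab:
        if word in keyword_intervals:
            lo, hi = keyword_intervals[word]
            if not (0 <= lo <= hi < n_topics):
                raise ValueError(f"bad interval for {word!r}: {(lo, hi)}")
            mask[word] = list(range(lo, hi + 1))
        else:
            mask[word] = list(range(n_topics))
    return mask


def sample_topic(word, weights, mask, rng):
    """Draw a topic for `word` from unnormalized `weights`, restricted to
    the topics allowed for that word.

    In a Gibbs sampler, `weights` would be the usual per-topic conditional
    weights; here they are simply passed in and must be positive.
    """
    allowed = mask[word]
    restricted = [weights[k] for k in allowed]
    return rng.choices(allowed, weights=restricted, k=1)[0]


# Hypothetical toy setup: two keywords pinned to two topic intervals.
vocab = ["ethnic", "migrant", "weather", "music"]
keyword_intervals = {"ethnic": (0, 1), "migrant": (2, 3)}
mask = build_topic_mask(vocab, keyword_intervals, n_topics=6)

rng = random.Random(0)
uniform = [1.0] * 6  # placeholder weights, not real model statistics
draws = {
    w: {sample_topic(w, uniform, mask, rng) for _ in range(200)}
    for w in vocab
}
```

Under these assumptions, every draw for a keyword stays inside its interval, so topics in that interval are steered toward the researcher's theme, while the remaining topics are left free to absorb the rest of the collection.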

Citation (APA)

Bodrunova, S., Koltsov, S., Koltsova, O., Nikolenko, S., & Shimorina, A. (2013). Interval semi-supervised LDA: Classifying needles in a haystack. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8265 LNAI, pp. 265–274). https://doi.org/10.1007/978-3-642-45114-0_21
