This paper addresses automatic stop list generation for processing speech recognition results of call center recordings, in particular for the purpose of clustering. We propose and test a supervised, domain-dependent method of automatic stop list generation. The method is based on finding words whose removal increases the dissimilarity between documents in different clusters and decreases the dissimilarity between documents within the same cluster. This approach is shown to be effective for clustering recognition results of recordings of varying quality, both on datasets that contain the same topics as the training dataset and on datasets containing other topics.
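The selection criterion described above can be illustrated with a minimal sketch. The abstract does not give the exact scoring function, so everything here is an assumption made for illustration: bag-of-words vectors, cosine dissimilarity, a score equal to mean between-cluster dissimilarity minus mean within-cluster dissimilarity, and a one-word-at-a-time test. The names `separation` and `generate_stop_list` and the toy transcripts are hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_dissimilarity(a, b):
    """Return 1 - cosine similarity of two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    # Assumption: an empty document is maximally dissimilar to everything.
    return 1.0 - dot / norm if norm else 1.0


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def separation(docs, labels, removed=frozenset()):
    """Mean between-cluster minus mean within-cluster dissimilarity (assumed score)."""
    bags = [Counter(w for w in doc if w not in removed) for doc in docs]
    within, between = [], []
    for i, j in combinations(range(len(docs)), 2):
        d = cosine_dissimilarity(bags[i], bags[j])
        (within if labels[i] == labels[j] else between).append(d)
    return _mean(between) - _mean(within)


def generate_stop_list(docs, labels):
    """Keep every word whose removal, taken alone, raises the separation score."""
    base = separation(docs, labels)
    vocab = sorted({w for doc in docs for w in doc})
    return [w for w in vocab if separation(docs, labels, frozenset([w])) > base]


# Toy labeled training set: tokenized transcripts with known cluster labels.
docs = [
    ["hello", "pay", "bill"],
    ["hello", "bill", "payment"],
    ["hello", "internet", "slow"],
    ["hello", "slow", "connection"],
]
labels = ["billing", "billing", "internet", "internet"]

stop_list = generate_stop_list(docs, labels)
```

On this toy data, `hello` occurs in every transcript, so removing it pushes the two clusters apart and it is selected, while the topical words `bill` and `slow` are kept because removing them makes documents of the same cluster less alike. The sketch evaluates each word independently against labeled training data, which is one simple reading of the supervised criterion. A real system would likely need a more scalable search over the vocabulary.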
Popova, S., Krivosheeva, T., & Korenevsky, M. (2014). Automatic stop list generation for clustering recognition results of call center recordings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8773, pp. 137–144). Springer Verlag. https://doi.org/10.1007/978-3-319-11581-8_17