In this paper we analyze different labeling strategies and their impact on speaker change detection rates. We explore binary, linear fuzzy, quadratic, and Gaussian labeling functions. We conclude that the choice of labeling function matters considerably and that the linear variant outperforms the rest. We also add phase information from the spectrum to the input of our convolutional neural network. Experiments show that although the phase is informative, its benefit is negligible, so it may be omitted. In the experiments we use a coverage-purity measure, which is independent of tolerance parameters.
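To make the four labeling strategies concrete, the following is a minimal sketch of how per-frame training targets around a speaker change point could be generated. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the function name `make_labels`, the `tolerance` half-width in frames, the choice of a binary window of the same width, and the Gaussian standard deviation of `tolerance / 3`.

```python
import numpy as np


def make_labels(n_frames, change_frames, tolerance=10, kind="linear"):
    """Return a per-frame target in [0, 1] that peaks at each change frame.

    Hypothetical helper: the shapes below are illustrative assumptions,
    not the definitions used in the paper.
    """
    frames = np.arange(n_frames, dtype=float)
    labels = np.zeros(n_frames)
    for c in change_frames:
        d = np.abs(frames - c)  # distance of every frame to this change point
        if kind == "binary":
            # 1 inside the window, 0 outside (assumed window width)
            lab = (d < tolerance).astype(float)
        elif kind == "linear":
            # triangular ("fuzzy") label: falls linearly to 0 at d == tolerance
            lab = np.clip(1.0 - d / tolerance, 0.0, 1.0)
        elif kind == "quadratic":
            # inverted parabola: flatter near the change, 0 at d == tolerance
            lab = np.clip(1.0 - (d / tolerance) ** 2, 0.0, 1.0)
        elif kind == "gaussian":
            # bell curve; sigma = tolerance / 3 is an assumed setting
            lab = np.exp(-0.5 * (d / (tolerance / 3.0)) ** 2)
        else:
            raise ValueError(f"unknown labeling function: {kind}")
        # where windows of nearby changes overlap, keep the larger value
        labels = np.maximum(labels, lab)
    return labels
```

For example, `make_labels(21, [10], tolerance=5, kind="linear")` yields a triangle that equals 1 at frame 10 and reaches 0 at frames 5 and 15. Under these assumptions, the binary variant gives the network a hard target, while the other three encode uncertainty about the exact position of the change.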
Hrúz, M., & Salajka, P. (2017). Phase analysis and labeling strategies in a CNN-based speaker change detection system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 613–622). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_61