Here, we introduce a method for classifying dialogues as formal or informal using feature sets derived from prosodic data. The first feature set consists of raw fundamental frequency (F0) values paired with speaker information (i.e., turn-taking). The second consists of prosodic labels extracted from the raw F0 values by the ProsoTool algorithm, likewise complemented with turn-taking information. We evaluated the two feature sets by comparing the accuracy our classification method attained with each when classifying dialogue excerpts taken from the HuComTech corpus. With the ProsoTool features we achieved an average accuracy of $$85.2\%$$, which corresponds to a relative error rate reduction of $$24\%$$ compared with the F0 features. Regardless of the feature set applied, however, our method is more accurate than human listeners, who distinguished formal from informal dialogue with an accuracy of only $$56.5\%$$.
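The relative error rate reduction quoted above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the common definition (the baseline error minus the improved error, divided by the baseline error) and treating the rounded figures in the abstract as exact. The function names are hypothetical and are not taken from the paper. The F0 accuracy it prints is only the value implied by those rounded figures, not a number reported in the abstract.

```python
def relative_error_reduction(baseline_acc: float, improved_acc: float) -> float:
    """Relative error rate reduction, assuming the common definition:
    (baseline error - improved error) / baseline error."""
    baseline_err = 1.0 - baseline_acc
    improved_err = 1.0 - improved_acc
    if baseline_err <= 0.0:
        raise ValueError("baseline accuracy must be below 1.0")
    return (baseline_err - improved_err) / baseline_err


def implied_baseline_accuracy(improved_acc: float, rerr: float) -> float:
    """Invert the definition above to recover the baseline accuracy
    implied by an improved accuracy and a relative error reduction."""
    if not 0.0 <= rerr < 1.0:
        raise ValueError("rerr must lie in [0, 1)")
    improved_err = 1.0 - improved_acc
    return 1.0 - improved_err / (1.0 - rerr)


# Figures from the abstract: 85.2% accuracy with ProsoTool features,
# 24% relative error rate reduction over the F0 features.
f0_acc = implied_baseline_accuracy(0.852, 0.24)
print(f"implied F0 accuracy: {f0_acc:.3f}")  # prints "implied F0 accuracy: 0.805"
print(f"round trip: {relative_error_reduction(f0_acc, 0.852):.2f}")  # prints "round trip: 0.24"
```

Under these assumptions, a 24% relative reduction means the error rate drops from roughly 19.5% to 14.8%, which is a larger gain than the difference of about five percentage points in accuracy might suggest.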
Szekrényes, I., & Kovács, G. (2017). Classification of formal and informal dialogues based on turn-taking and intonation using deep neural networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 233–243). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_22