Learning to infer public emotions from large-scale networked voice data

Abstract

Emotions are increasingly, and controversially, central to our public life. Compared with text or image data, voice is the most natural and direct way to express one's emotions in real time. With the growing adoption of smartphone voice-dialogue applications (e.g., Siri and Sogou Voice Assistant), large-scale networked voice data can help us quantitatively understand the emotional world we live in. In this paper, we study the problem of inferring public emotions from large-scale networked voice data. In particular, we first investigate the primary emotions and the underlying emotion patterns in human-mobile voice communication. We then propose a partially-labeled factor graph model (PFG) that incorporates both acoustic features (e.g., energy, f0, MFCC, LFPC) and correlation features (e.g., individual consistency, time associativity, environment similarity) to automatically infer emotions. We evaluate the proposed model on a real dataset from the Sogou Voice Assistant application, and the experimental results verify its effectiveness. © 2014 Springer International Publishing.
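
The abstract names per-utterance acoustic features such as energy, f0, and MFCC. As a rough illustration of what such an acoustic front end could look like (this is not the authors' pipeline; the librosa library, the sampling rate, and the summary statistics are all assumptions made here for the sketch):

```python
# Illustrative sketch only: the paper's PFG model is not reproduced here.
# This extracts some of the acoustic features the abstract lists (energy, f0, MFCC);
# the library choice (librosa) and all parameter values are assumptions.
import numpy as np
import librosa

def extract_acoustic_features(path, sr=16000):
    """Return an utterance-level vector of energy, f0, and MFCC statistics."""
    y, sr = librosa.load(path, sr=sr)

    # Short-time energy (RMS per frame).
    rms = librosa.feature.rms(y=y)[0]

    # Fundamental frequency (f0) via probabilistic YIN; unvoiced frames are NaN.
    f0, _, _ = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),
        fmax=librosa.note_to_hz("C7"),
        sr=sr,
    )
    f0 = f0[~np.isnan(f0)]

    # 13 Mel-frequency cepstral coefficients per frame.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

    # Summarize frame-level features as utterance-level statistics.
    return np.concatenate([
        [rms.mean(), rms.std()],
        [f0.mean() if f0.size else 0.0, f0.std() if f0.size else 0.0],
        mfcc.mean(axis=1),
        mfcc.std(axis=1),
    ])
```

In the paper's setting, vectors like this would feed the attribute factors of the PFG model, while the correlation features (individual consistency, time associativity, environment similarity) link utterances to one another; that graphical part is not sketched here.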

Citation (APA)

Ren, Z., Jia, J., Cai, L., Zhang, K., & Tang, J. (2014). Learning to infer public emotions from large-scale networked voice data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8325 LNCS, pp. 327–339). https://doi.org/10.1007/978-3-319-04114-8_28
