Abstract
In this paper, the problem of conflict detection in audiovisual recordings of political debates is investigated. In contrast to the current state of the art in social signal processing, where only the audio modality is employed to analyse human non-verbal behaviour, we propose to additionally use visual features that capture facial behavioural cues related to conflict, such as head nodding, fidgeting, and frowning. To this end, a dataset of video excerpts from televised political debates, where conflicts naturally arise, is introduced. The conflict level (i.e., conflict/non-conflict) is predicted by applying a linear support vector machine and a collaborative representation-based classifier to audio, visual, and audiovisual features. The experimental results demonstrate that the fusion of audio and visual features yields higher conflict-detection accuracy than features drawn from a single modality (i.e., either audio or video).
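The audiovisual fusion described above can be sketched as feature-level (early) fusion: the per-sample audio and visual feature vectors are concatenated and fed to a linear SVM. The snippet below illustrates this pipeline on synthetic data; the feature dimensions, variable names, and random labels are illustrative assumptions, not the paper's actual descriptors or dataset.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
# Hypothetical per-excerpt features (dimensions are assumptions):
audio = rng.normal(size=(n, 40))   # e.g., prosodic/spectral audio descriptors
visual = rng.normal(size=(n, 30))  # e.g., facial-cue descriptors (nods, frowns)
y = rng.integers(0, 2, size=n)     # 1 = conflict, 0 = non-conflict (synthetic)

# Feature-level fusion: concatenate the two modalities per sample
fused = np.hstack([audio, visual])

X_tr, X_te, y_tr, y_te = train_test_split(fused, y, random_state=0)
clf = LinearSVC().fit(X_tr, y_tr)
preds = clf.predict(X_te)
```

With real descriptors in place of the random matrices, the same concatenate-then-classify scheme compares single-modality inputs (`audio` or `visual` alone) against the fused representation.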
Panagakis, Y., Zafeiriou, S., & Pantic, M. (2015). Audiovisual conflict detection in political debates. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8925, pp. 306–314). Springer Verlag. https://doi.org/10.1007/978-3-319-16178-5_21