Multi-attention recurrent network for human communication comprehension

351 citations · 285 readers (Mendeley)

Abstract

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape the communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN achieves state-of-the-art performance on all six datasets.
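The abstract's core idea can be illustrated with a minimal sketch: at each time step, K attention distributions over the concatenated modality states pick out K cross-modal interactions, which are reduced to a single code and retained in a separate hybrid memory alongside the per-modality recurrent states. The shapes, weight matrices, and gated update below are illustrative assumptions for exposition, not the paper's exact equations.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_attention_block(h_cat, W_att, W_red):
    """Hypothetical MAB-style step: K attentions over the concatenated
    modality states h_cat (dim d).

    W_att: (K, d, d) -- W_att[k] @ h_cat scores each coordinate of h_cat,
           and a softmax turns the scores into the k-th attention weights.
    W_red: (m, K * d) -- reduces the K attended copies to one code z.
    """
    attended = [softmax(W_att[k] @ h_cat) * h_cat
                for k in range(W_att.shape[0])]
    return np.tanh(W_red @ np.concatenate(attended))  # z: (m,)

# Toy dimensions: language/vision/acoustic hidden states of sizes 4, 3, 3.
h_l, h_v, h_a = rng.normal(size=4), rng.normal(size=3), rng.normal(size=3)
h_cat = np.concatenate([h_l, h_v, h_a])          # d = 10
K, m = 2, 5                                      # num attentions, memory size
W_att = rng.normal(size=(K, h_cat.size, h_cat.size))
W_red = rng.normal(size=(m, K * h_cat.size))

z = multi_attention_block(h_cat, W_att, W_red)

# LSTHM-style hybrid memory (simplified): a gate mixes the previous memory
# with the new cross-modal code, so interactions discovered through time are
# retained rather than overwritten each step.
u_prev = np.zeros(m)
gate = 1.0 / (1.0 + np.exp(-z))                  # illustrative sigmoid gate
u = gate * u_prev + (1.0 - gate) * z
```

In the actual model these weights are learned end to end and the per-modality states come from the LSTHM's recurrent updates; the sketch only shows the data flow of attending across modalities and storing the result in a hybrid memory.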

Citation (APA)

Zadeh, A., Vij, P., Liang, P. P., Cambria, E., Poria, S., & Morency, L. P. (2018). Multi-attention recurrent network for human communication comprehension. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 5642–5649). AAAI press. https://doi.org/10.1609/aaai.v32i1.12024
