Learning Self-Supervised Multimodal Representations of Human Behaviour


Abstract

Self-supervised learning of representations has important potential applications in human behaviour understanding. The ability to learn useful representations from large unlabeled datasets by modeling intrinsic properties of the data has been successfully employed in various fields of machine learning, often outperforming transfer learning or fully supervised training. My research interests lie in applying these ideas to multimodal human-centric data. In this extended abstract, I present the direction of research that I have followed during the first half of my PhD, along with ideas and work in progress for the second half. My completed research so far demonstrates the potential of cross-modal self-supervision for audio representation learning, especially on small downstream datasets. I want to explore similar ideas for visual and multimodal representation learning, and apply them to speech and emotion recognition and multimodal question answering.
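
The abstract mentions cross-modal self-supervision for audio representation learning. As a rough illustration of one common instantiation of that idea (not necessarily the exact method used in the paper), the sketch below trains an audio encoder and a video encoder to agree on temporally aligned clips with a contrastive objective; all module names, feature dimensions, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of cross-modal self-supervision: an audio encoder and a video
# encoder are pulled together on temporally aligned clips via an InfoNCE-style
# contrastive loss. Shapes, layer sizes, and hyperparameters are assumptions
# for illustration only, not the paper's configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallEncoder(nn.Module):
    """Toy MLP encoder mapping a flat feature vector into a shared embedding space."""
    def __init__(self, in_dim: int, emb_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalise so the dot products below act as cosine similarities.
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(audio_emb, video_emb, temperature: float = 0.07):
    """InfoNCE over a batch: aligned audio/video pairs are positives,
    all other pairings within the batch serve as negatives."""
    logits = audio_emb @ video_emb.t() / temperature
    targets = torch.arange(logits.size(0))
    # Symmetric objective: audio-to-video and video-to-audio retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

if __name__ == "__main__":
    # Hypothetical pre-extracted features: 40-dim audio descriptors and
    # 512-dim visual frame features for a batch of 32 aligned clips.
    audio_feats = torch.randn(32, 40)
    video_feats = torch.randn(32, 512)

    audio_enc, video_enc = SmallEncoder(40), SmallEncoder(512)
    opt = torch.optim.Adam(
        list(audio_enc.parameters()) + list(video_enc.parameters()), lr=1e-3)

    loss = contrastive_loss(audio_enc(audio_feats), video_enc(video_feats))
    loss.backward()
    opt.step()
    print(f"contrastive loss: {loss.item():.4f}")
```

After pretraining in this way, the audio encoder alone can be fine-tuned or probed on small labelled downstream datasets (e.g. emotion or speech recognition), which is the setting the abstract highlights.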

Citation (APA)

Shukla, A. (2020). Learning Self-Supervised Multimodal Representations of Human Behaviour. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 4748–4751). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3416518
