Latent-Space Data Augmentation for Visually-Grounded Language Understanding

Abstract

This is an extended version of a selected paper from JSAI 2019. In this paper, we study data augmentation for visually-grounded language understanding in the context of a picking task. A typical picking task consists of predicting a target object specified by an ambiguous instruction, e.g., “Pick up the yellow toy near the bottle”. We show that existing methods for understanding such instructions can be improved by data augmentation. More specifically, MTCM [1] and MTCM-GAN [2] achieve better results when data augmentation is applied to latent-space features instead of raw features. Additionally, our results show that latent-space data augmentation improves network accuracy more than regularization methods do.
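To illustrate the general idea behind latent-space data augmentation, the sketch below perturbs latent feature vectors (e.g., produced by a pretrained encoder) with Gaussian noise to create additional training samples. This is a minimal, hypothetical example of the technique in general; the function name, noise model, and parameters are assumptions for illustration, not the specific GAN-based method used in the paper.

```python
import numpy as np

def augment_latent(z, n_aug=4, sigma=0.1, rng=None):
    """Perturb latent vectors with Gaussian noise to create augmented samples.

    z: (batch, dim) array of latent features from a pretrained encoder.
    Returns an array of shape (batch * (n_aug + 1), dim) holding the
    original vectors followed by n_aug noisy copies of each batch.
    """
    rng = np.random.default_rng(rng)
    copies = [z] + [z + rng.normal(0.0, sigma, size=z.shape)
                    for _ in range(n_aug)]
    return np.concatenate(copies, axis=0)

# Example: 8 latent vectors of dimension 16 -> 40 training samples.
z = np.zeros((8, 16))
z_aug = augment_latent(z, n_aug=4, sigma=0.1, rng=0)
print(z_aug.shape)  # (40, 16)
```

Augmenting in latent space rather than on raw inputs means the perturbations stay on (or near) the learned feature manifold, which is one intuition for why such augmentation can outperform input-space noise or plain regularization.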

Citation (APA)
Magassouba, A., Sugiura, K., & Kawai, H. (2020). Latent-Space Data Augmentation for Visually-Grounded Language Understanding. In Advances in Intelligent Systems and Computing (Vol. 1128 AISC, pp. 179–187). Springer. https://doi.org/10.1007/978-3-030-39878-1_17
