Multi-modal data generation with a deep metric variational autoencoder

  • Sundgaard J
  • Hannemose M
  • Laugesen S
  • et al.

Abstract

We present a deep metric variational autoencoder for multi-modal data generation. The variational autoencoder employs a triplet loss in the latent space, which allows for conditional data generation by sampling new embeddings within each class cluster. The approach is evaluated on a multi-modal dataset consisting of otoscopy images of the tympanic membrane with corresponding wideband tympanometry measurements. The modalities in this dataset are correlated, as they represent different aspects of the state of the middle ear, but they do not exhibit a direct pixel-to-pixel correspondence. The approach shows promising results for the conditional generation of paired images and tympanograms, and allows for efficient data augmentation from multi-modal sources.
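To make the core idea concrete, the PyTorch sketch below combines a standard VAE objective (reconstruction plus KL term) with a triplet margin loss computed on the latent means, which is the general mechanism the abstract describes. It is an illustrative reconstruction only, not the authors' code: the layer sizes, the single-modality decoder, and the loss weights `beta` and `gamma` are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeepMetricVAE(nn.Module):
    """Minimal VAE whose latent space is additionally shaped by a triplet loss (sketch)."""

    def __init__(self, in_dim=784, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, in_dim)
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Standard reparameterization trick: z = mu + sigma * eps
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar


def vae_triplet_loss(model, anchor, positive, negative, beta=1.0, gamma=1.0):
    """ELBO terms for the anchor plus a triplet margin loss on the latent means."""
    recon, mu_a, logvar_a = model(anchor)
    recon_loss = F.mse_loss(recon, anchor, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar_a - mu_a.pow(2) - logvar_a.exp())
    mu_p, _ = model.encode(positive)   # sample from the same class as the anchor
    mu_n, _ = model.encode(negative)   # sample from a different class
    triplet = F.triplet_margin_loss(mu_a, mu_p, mu_n, margin=1.0)
    return recon_loss + beta * kl + gamma * triplet


if __name__ == "__main__":
    model = DeepMetricVAE()
    a, p, n = (torch.rand(8, 784) for _ in range(3))
    vae_triplet_loss(model, a, p, n).backward()
```

In the paper's setting, the decoder would produce both modalities (otoscopy image and wideband tympanogram), and new pairs for a given class would be generated by sampling latent vectors within that class's cluster; those details are omitted from this sketch.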

Cite

APA

Sundgaard, J. V., Hannemose, M. R., Laugesen, S., Bray, P., Harte, J., Kamide, Y., … Christensen, A. N. (2023). Multi-modal data generation with a deep metric variational autoencoder. Proceedings of the Northern Lights Deep Learning Workshop, 4. https://doi.org/10.7557/18.6803
