3D object geometry reconstruction remains a challenge when working with transparent, occluded, or highly reflective surfaces. While recent methods classify shape features from raw audio, we present a multimodal neural network optimized for estimating an object's geometry and material. Our networks use spectrograms of recorded and synthesized object impact sounds, together with voxelized shape estimates, to extend the capabilities of vision-based reconstruction. We evaluate our method on multiple datasets of both recorded and synthesized sounds. We further present an interactive application for real-time scene reconstruction in which a user can strike objects, producing sounds that the system uses to instantly classify and segment the struck object, even when the object is transparent or visually occluded.
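As a rough illustration of the audio input described above, the sketch below converts a recorded impact sound into a log-magnitude spectrogram. The frame length, hop size, sample rate, and the synthetic decaying-sinusoid "impact" are illustrative assumptions, not the paper's actual preprocessing parameters.

```python
# Hedged sketch: computing a log-magnitude spectrogram from an impact
# sound, the kind of audio representation fed to a multimodal network.
# Frame length, hop size, and sample rate are assumed values.
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Short-time Fourier magnitude spectrogram using a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    mags = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, freq_bins)
    return np.log1p(mags).T                     # (freq_bins, n_frames)

# Synthetic stand-in for a struck object: an exponentially decaying tone.
sr = 16000
t = np.arange(sr) / sr
impact = np.exp(-8 * t) * np.sin(2 * np.pi * 440 * t)
spec = spectrogram(impact)
```

The resulting 2D array can then be treated like an image and passed to a convolutional audio branch alongside a voxelized shape estimate.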
CITATION STYLE
Sterling, A., Wilson, J., Lowe, S., & Lin, M. C. (2018). ISNN: Impact sound neural network for audio-visual object classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11219 LNCS, pp. 578–595). Springer Verlag. https://doi.org/10.1007/978-3-030-01267-0_34