Automatic Environmental Sound Recognition: Performance Versus Computational Cost

Siddharth Sigtia; Adam M. Stark; Sacha Krstulović; Mark D. Plumbley

Journal Article

IEEE/ACM Transactions on Audio Speech and Language Processing (2016) 24(11) 2096-2107

DOI: 10.1109/TASLP.2016.2592698

74 Citations

137 Readers

Abstract

In the context of the Internet of Things, sound sensing applications are required to run on embedded platforms where notions of product pricing and form factor impose hard constraints on the available computing power. Whereas Automatic Environmental Sound Recognition (AESR) algorithms are most often developed with limited consideration for computational cost, this paper seeks which AESR algorithm can make the most of a limited amount of computing power by comparing the sound classification performance as a function of its computational cost. Results suggest that Deep Neural Networks yield the best ratio of sound classification accuracy across a range of computational costs, while Gaussian Mixture Models offer a reasonable accuracy at a consistently small cost, and Support Vector Machines stand between both in terms of compromise between accuracy and computational cost.

Cite

Sigtia, S., Stark, A. M., Krstulović, S., & Plumbley, M. D. (2016). Automatic Environmental Sound Recognition: Performance Versus Computational Cost. IEEE/ACM Transactions on Audio Speech and Language Processing, 24(11), 2096–2107. https://doi.org/10.1109/TASLP.2016.2592698
