In this work, a computationally efficient technique for acoustic event classification is presented. The approach exploits the structure of the cochleagram by identifying dominant time–frequency units. The input signal is split into frames, a cochleagram is calculated for each frame and masked by a set of class masks to determine the most probable audio class. The mask for a given class is estimated from a training set of time-aligned events by selecting the dominant-energy parts of the time–frequency plane: consecutive cochleagrams are thresholded, the resulting binary maps are summed, and a final threshold applied to the sum yields the representation of that class. All available masks for all classes are checked in sequence to determine the most probable label for the audio event under consideration. The proposed technique was verified on a small database of acoustic events typical of surveillance systems. The results show that such an approach can be used in systems with limited computational resources while giving satisfactory classification results.
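The mask-estimation and classification steps described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the cochleagrams are assumed to be precomputed (in practice they come from a gammatone filterbank), and the two threshold rules (`unit_thresh` as a fraction of each cochleagram's peak energy, `vote_thresh` as a fraction of the training-set size) are illustrative assumptions, since the abstract does not specify the exact thresholding scheme.

```python
import numpy as np

def estimate_class_mask(cochleagrams, unit_thresh=0.5, vote_thresh=0.6):
    """Estimate a binary class mask from time-aligned training cochleagrams.

    Each cochleagram is thresholded to keep its dominant T-F units, the
    binary maps are summed, and a final threshold on the vote count yields
    the class mask. Threshold rules are illustrative assumptions.
    """
    votes = np.zeros_like(cochleagrams[0], dtype=float)
    for cg in cochleagrams:
        # keep T-F units above a fraction of this cochleagram's peak energy
        votes += (cg >= unit_thresh * cg.max()).astype(float)
    # keep units selected in a sufficient fraction of the training examples
    return votes >= vote_thresh * len(cochleagrams)

def classify(cochleagram, class_masks):
    """Check every class mask in sequence; return the best-scoring class.

    The score is the mean energy of the cochleagram inside the mask
    (a simple assumed matching criterion).
    """
    scores = {label: cochleagram[mask].sum() / max(mask.sum(), 1)
              for label, mask in class_masks.items()}
    return max(scores, key=scores.get)
```

A short usage sketch on synthetic data: two classes whose energy is concentrated in different frequency bands can be separated by their masks.

```python
rng = np.random.default_rng(0)

def synthetic_cochleagram(band):
    # low-level background with strong energy in one band of channels
    cg = rng.random((16, 32)) * 0.1
    cg[band] += 1.0
    return cg

masks = {
    "siren": estimate_class_mask([synthetic_cochleagram(slice(0, 4)) for _ in range(5)]),
    "glass": estimate_class_mask([synthetic_cochleagram(slice(10, 14)) for _ in range(5)]),
}
print(classify(synthetic_cochleagram(slice(0, 4)), masks))   # -> "siren"
```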
Maka, T. (2019). Computationally Efficient Classification of Audio Events Using Binary Masked Cochleagrams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11538 LNCS, pp. 719–728). Springer Verlag. https://doi.org/10.1007/978-3-030-22744-9_56