In this paper, we define features that can be computed along audio signals in order to assess the level of auditory attention on a normalized scale, i.e., between 0 and 1. The proposed features are derived from a time-frequency representation of the audio signal and highlight salient regions, such as regions with high loudness or with strong temporal and frequency contrast. Normalized auditory attention levels can be used to detect sudden and unexpected changes in audio texture and to direct the attention of a surveillance operator to sound segments of interest in the audio streams being monitored. The proposed algorithms have been tested on audio material consisting of security-relevant audio events (e.g., gunshot, glass breaking, woman's scream, siren) embedded in ambient sound from public places (e.g., airport hall, metro station, subway train, sports stadium).
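As a rough illustration of the idea, the sketch below computes three simple saliency cues from a short-time spectrum (a loudness proxy, a temporal contrast, and a frequency contrast), rescales each to the range 0 to 1, and averages them. It is not the authors' algorithm: the abstract does not specify the features, so the function name `attention_levels`, the frame parameters, the min-max normalization, and the equal-weight average are all assumptions made for this example.

```python
import numpy as np

def attention_levels(signal, frame_len=512, hop=256):
    """Return one illustrative attention level in [0, 1] per frame.

    Hypothetical helper: the cues and their combination are assumptions,
    not the features defined in the paper.
    """
    # Time-frequency representation: windowed frames -> log-magnitude spectrum.
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)
    ])
    log_spec = np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

    # Loudness proxy: mean log-magnitude of each frame.
    loudness = log_spec.mean(axis=1)
    # Temporal contrast: mean absolute change relative to the previous frame.
    temporal = np.abs(np.diff(log_spec, axis=0, prepend=log_spec[:1])).mean(axis=1)
    # Frequency contrast: mean absolute difference between adjacent bins.
    spectral = np.abs(np.diff(log_spec, axis=1)).mean(axis=1)

    def normalize(x):
        # Min-max rescaling to [0, 1]; a constant cue maps to all zeros.
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    # Equal-weight average of the normalized cues (an assumed combination rule).
    return (normalize(loudness) + normalize(temporal) + normalize(spectral)) / 3.0

# Usage: low-level noise with a short 1 kHz burst standing in for an "event".
rng = np.random.default_rng(0)
sample_rate = 8000
audio = 0.01 * rng.standard_normal(2 * sample_rate)
burst_t = np.arange(800) / sample_rate
audio[8000:8800] += np.sin(2 * np.pi * 1000 * burst_t)
levels = attention_levels(audio)
peak_frame = int(np.argmax(levels))
```

In this toy setting the highest level falls on the frames that overlap the burst, which is the behaviour an operator-alerting system would rely on. A practical system would presumably normalize against a running estimate of the background rather than against the whole recording.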
CITATION STYLE
Couvreur, L., Bettens, F., Hancq, J., & Mancas, M. (2008). Normalized auditory attention levels for automatic audio surveillance. In WIT Transactions on Information and Communication Technologies (Vol. 39, pp. 453–462). https://doi.org/10.2495/RISK080441