In this chapter, we consider the set of tools for audio compression developed under the aegis of the Moving Picture Experts Group (MPEG), which apply to general audio such as music rather than specifically to speech. Surprisingly, this subject has much to do with psychology, specifically the field of aural sense perception known as psychoacoustics. The phenomena of frequency masking and temporal masking are exploited in a waveform coding approach that makes use of a psychoacoustic model of hearing; the result is generally referred to as perceptual coding. We look in some detail at how audio compression benefits from psychoacoustics, and how this plays out in MPEG-1 audio compression (including Layer 3, known as MP3) and in later MPEG audio developments: MPEG-2 and MPEG-4, including MPEG Advanced Audio Coding (AAC). We begin the study of psychoacoustics with the determination of the equal-loudness relations, which leads to a discussion of frequency masking. Critical bands are then introduced, along with the Bark unit, followed by temporal masking, a phenomenon familiar from our own experience. MPEG Audio and its layers, including MP3, are then presented as making use of these properties. MPEG-2 AAC is considered next, and MPEG-4 Audio is also discussed.
Li, Z.-N., Drew, M. S., & Liu, J. (2014). MPEG Audio Compression (pp. 457–482). https://doi.org/10.1007/978-3-319-05290-8_14