Audio coding standard overview: MPEG4-AAC, HE-AAC, and HE-AAC V2

Yujie Gao

Book Chapter

Springer US, (2009), 607-627

DOI: 10.1007/978-0-387-78263-8_21

Abstract

Advanced Audio Coding (AAC) has become one of the most widely adopted audio formats in mobile applications. This chapter gives a brief history of the MPEG-4 AAC decoder family, followed by more detail on the MPEG-4 AAC, HE-AAC, and HE-AAC V2 systems. In April 1997, the MPEG-2 Advanced Audio Coding (MPEG-2 AAC) [1] compression algorithm, which takes advantage of new spectrum processing and compression tools such as temporal noise shaping (TNS) [1, 2], became an international standard. Compared with earlier audio compression algorithms, the new standard provides outstanding audio quality and an exceptional compression ratio, and thus a lower bit rate in the encoded bitstream. It has gradually become one of the audio codec standards of choice for broadcasting, internet services, and mobile applications. The MPEG-4 AAC standard [2] was adopted by the MPEG community in 1999. It is based on the MPEG-2 AAC standard and, in terms of bitstream syntax, keeps maximum compatibility with it. In other words, an MPEG-4 AAC decoder should generally be able to decode MPEG-2 AAC bitstreams. On the other hand, the new standard adds improvements in scalability and error resilience, as well as additional spectral processing features including Perceptual Noise Substitution (PNS) and the Long-Term Predictor (LTP) [2]. An MPEG-2 AAC decoder may therefore have trouble decoding an MPEG-4 AAC stream that uses MPEG-4-specific tools or features. In 2003, the MPEG community standardized High Efficiency AAC (HE-AAC) [2], an extension of the AAC algorithm that targets low-bit-rate applications with higher coding efficiency. HE-AAC adopts a new tool called Spectral Band Replication (SBR) [2, 3], which reconstructs the high-frequency band of the output from the low-frequency band data and some side information. In 2004, HE-AAC Version 2 (HE-AAC V2) [4] was standardized by the MPEG community.
It adds the Parametric Stereo (PS) tool [4] on top of SBR (HE-AAC); PS reconstructs a stereo audio signal from a monaural downmix and a limited number of additional stereo parameters. In summary, MPEG-4 AAC, HE-AAC, and HE-AAC V2 make up the AAC decoder family. Depending on the compression efficiency required, the most appropriate member can be chosen to achieve the best compression ratio at the required audio quality. The concepts of audio object type and profile play a very important role in the AAC coding standards. As specified in the MPEG-2 and MPEG-4 ISO specifications, AAC comes in different "flavors", which give maximum flexibility to different applications and usage models. In the MPEG-2 standard they are called Profiles [1], while in MPEG-4 they also include Audio Object Types, or AOTs for short [5]. The AAC standards support tens of these flavors, which differ in the optional tools used in the encoding or decoding process. The AAC codec family consists of all these flavors and is therefore suitable for a broad range of applications. However, the different audio object types are not necessarily compatible with each other. The MPEG-2 AAC specification supports three profiles: Main Profile, Low Complexity Profile (LC), and Scalable Sampling Rate Profile (SSR) [1]. MPEG-4 AAC supports multiple audio object types, some of which are close counterparts of the MPEG-2 AAC profiles, for example the AAC Main, AAC-LC, and AAC-SSR objects [5]. The bitstream syntax of these MPEG-4 audio object types is very similar to that of the corresponding MPEG-2 profiles, except that an MPEG-4 bitstream may carry PNS-related data.
Thus, an MPEG-4 AAC decoder that supports these audio object types can parse and decode the corresponding MPEG-2 profile bitstreams, while an MPEG-2 profile decoder can parse its MPEG-4 counterpart object only if the stream contains no PNS information. MPEG-4 AAC audio object types widely used in mobile applications include the MPEG-4 AAC LC object, the MPEG-4 AAC LTP object, the MPEG-4 ER AAC LC (Error Resilient AAC Low Complexity) object, and the SBR-related objects [5]. Each of them targets one or more goals that are important to mobile applications, such as low complexity and low power with good audio quality, low bit rate, suitable channel settings, and error robustness. The MPEG-4 AAC LC object is the basic audio object type and the minimum requirement for AAC decoding, just like the MPEG-2 AAC LC profile. The MPEG-4 AAC LTP object type adds long-term prediction (LTP) on top of the basic MPEG-4 LC object. LTP takes advantage of the redundancy between successive frames of a clearly pitched audio signal to achieve a lower bit rate when encoding such a source. Because it is based on the LC object, a decoder compatible with the MPEG-4 AAC LTP object can also decode both the MPEG-2 LC profile and the MPEG-4 LC object type. The ER AAC LC object type combines error resilience functionality with the AAC LC object. In mobile services such as streaming and broadcasting, bitstreams often contain bit errors after passing through the transmission channel. The error resilience tools give more protection to the important side information or spectrum information encoded in the bitstream and can thus improve audio quality when bitstream errors occur. Because additional information has to be packed into the bitstream to achieve error resilience, the bitstream syntax of the ER object type is very different from that of the other object types and is not compatible with non-ER objects. Therefore, a non-ER object decoder is not able to decode ER bitstreams.
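
The compatibility rules stated so far can be summarized as a small predicate. The following Python sketch is only an illustration of those rules under simplifying assumptions: the names `Stream`, `Decoder`, and `can_decode` are hypothetical, PNS is treated as the only MPEG-4-specific tool that can appear in a counterpart object, and ER and non-ER syntaxes are treated as incompatible in both directions. It is not taken from the chapter or from any decoder implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    standard: str                  # "MPEG-2" or "MPEG-4"
    kind: str                      # "Main", "LC", "SSR", or "LTP"
    uses_pns: bool = False         # PNS data present (MPEG-4 only)
    error_resilient: bool = False  # ER syntax (MPEG-4 only)


@dataclass(frozen=True)
class Decoder:
    standard: str                  # "MPEG-2" or "MPEG-4"
    kind: str                      # profile / object type it implements
    error_resilient: bool = False


def can_decode(dec: Decoder, s: Stream) -> bool:
    """Apply the compatibility rules from the abstract (simplified)."""
    # ER and non-ER bitstream syntaxes are not compatible (assumed symmetric).
    if dec.error_resilient != s.error_resilient:
        return False

    if dec.standard == "MPEG-4":
        # An LTP decoder is built on LC, so it also accepts LC streams,
        # whether they are MPEG-2 LC profile or MPEG-4 LC object streams.
        accepted = {dec.kind}
        if dec.kind == "LTP":
            accepted.add("LC")
        return s.kind in accepted

    # MPEG-2 decoder: same profile only, and an MPEG-4 counterpart
    # stream is acceptable only when it carries no PNS data.
    if s.kind != dec.kind:
        return False
    return s.standard == "MPEG-2" or not s.uses_pns
```

For example, `can_decode(Decoder("MPEG-2", "LC"), Stream("MPEG-4", "LC", uses_pns=True))` evaluates to `False`, while the same call without PNS evaluates to `True`.
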
The SBR-related objects cover multiple audio object types with different AOT values. In essence, they apply the SBR tool on top of other MPEG-4 AAC audio object types, such as Main, LC, and LTP [5]. The rest of this chapter is organized as follows. The next three sections present more details of MPEG-4 AAC, HE-AAC, and HE-AAC V2, respectively. The chapter closes with some conclusions and further discussion. © 2009 Springer Science+Business Media, LLC.
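
As a rough illustration of how the object types and tools named in the abstract relate, the sketch below lists a few AOT values and maps the set of extension tools in use to a member of the AAC family. The numeric values are those assigned in the audio object type table of ISO/IEC 14496-3 and are not given in the abstract; `AOT`, `FAMILY_TOOLS`, and `family_member` are hypothetical names chosen for this example.

```python
from enum import IntEnum


class AOT(IntEnum):
    """A few MPEG-4 audio object type values (ISO/IEC 14496-3)."""
    AAC_MAIN = 1
    AAC_LC = 2
    AAC_SSR = 3
    AAC_LTP = 4
    SBR = 5
    ER_AAC_LC = 17
    PS = 29


# Extension tools layered on the AAC core for each family member.
FAMILY_TOOLS = {
    "MPEG-4 AAC": frozenset(),
    "HE-AAC": frozenset({"SBR"}),
    "HE-AAC V2": frozenset({"SBR", "PS"}),
}


def family_member(tools):
    """Return the family member that uses exactly this set of tools."""
    wanted = frozenset(tools)
    for name, needed in FAMILY_TOOLS.items():
        if wanted == needed:
            return name
    # PS is defined on top of SBR, so {"PS"} alone is rejected here.
    raise ValueError(f"unsupported tool combination: {sorted(wanted)}")
```

Calling `family_member({"SBR", "PS"})` returns `"HE-AAC V2"`, and `family_member({"PS"})` raises `ValueError`, reflecting that PS is used on the basis of SBR.
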

Citation (APA)

Gao, Y. (2009). Audio coding standard overview: MPEG4-AAC, HE-AAC, and HE-AAC V2. In Mobile Multimedia Broadcasting Standards: Technology and Practice (pp. 607–627). Springer US. https://doi.org/10.1007/978-0-387-78263-8_21
