Music 422
Cross-listed as EE 367C.
Have you ever wondered how your MP3 files squeeze so much sound into such a small size? What the difference is between MP3 and AAC? Or which multichannel audio coding format is best for your application?
The need for significant data rate reduction in the transmission and storage of wideband digital audio signals has led to the development of psychoacoustics-based data compression techniques. In this approach, the limitations of human hearing are exploited to remove inaudible components of audio signals. The bit rate reduction these methods achieve without sacrificing perceived quality greatly exceeds what is possible with lossless techniques alone. Perceptual audio coders are currently used in many applications, including digital radio and television, digital sound on film, multimedia/Internet audio, and mobile devices.
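To give a rough sense of the data rates involved, the short Python sketch below compares uncompressed CD-quality PCM with a perceptually coded stream. The 128 kb/s target is an assumption chosen only for illustration (a commonly used stereo MP3 rate), not a figure from the course description, and the function name is hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# CD-quality audio: 44.1 kHz sampling, 16 bits per sample, 2 channels.
cd_rate = pcm_bit_rate(44_100, 16, 2)

# Assumed target rate for a perceptual coder (illustrative only).
coded_rate = 128_000

# Ratio of the uncompressed rate to the coded rate.
ratio = cd_rate / coded_rate

print(f"PCM: {cd_rate / 1000:.1f} kb/s, coded: {coded_rate / 1000:.0f} kb/s, "
      f"ratio: {ratio:.1f}:1")
```

Under these assumptions the uncompressed stream runs at about 1411 kb/s, so the coded stream is roughly eleven times smaller.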
This class integrates digital signal processing, psychoacoustics, rate/distortion optimization, and programming to provide the basis for understanding and building perceptual audio coding systems. We review the basic principles underlying the core components of a perceptual audio codec and study the design choices made in state-of-the-art audio coding schemes, e.g., AC-3 (also known as Dolby Digital), Enhanced AC-3, and AC-4; MPEG Layers I, II, and III (MP3); MPEG AAC; and MPEG-H. In-class demonstrations will let you hear the quality of state-of-the-art implementations at varying data rates and, as a final project, you will program your own simple perceptual audio coder.