Python Programs and Book for building an audio coder and for deep learning for audio
Date:
Fri, 10/07/2022 - 3:30pm - 4:20pm
Event Type:
DSP Seminar
Abstract: Audio coding has become a ubiquitous tool for transmitting and storing audio signals, for instance in high-quality teleconferencing with "FaceTime" and similar applications, and in almost every way of listening to music. A more recent tool for adaptive audio processing is deep learning, which so far has mostly been used for image and speech processing, but is increasingly applied to audio as well. Both can be implemented and experimented with in Python, which allows fast prototyping and is also the usual programming language for deep learning. This talk will present, in a way, “what I did during my pandemic”: Python tools and examples for building an audio coder, and examples for my tutorial on "Deep Learning for Audio". These tools are available in public GitHub repositories together with Python Colab notebook descriptions, and they are described in my new book, Filter Banks and Audio Coding - Compressing Audio Signals Using Python (Springer), and in videos on YouTube:
https://github.com/TUIlmenauAMS/AudioCoding_Tutorials
https://github.com/TUIlmenauAMS/Python-Audio-Coder
https://github.com/TUIlmenauAMS/AES_Tutorial_2021
Bio: Gerald Schuller has been a full professor at the Institute for Media Technology of the Technical University of Ilmenau since 2008. He was head of the Audio Coding for Special Applications group of the Fraunhofer Institute for Digital Media Technology (IDMT) in Ilmenau, Germany, from January 2002 until 2008, and is now a member of Fraunhofer IDMT. Before joining the Fraunhofer Institute, he was a Member of Technical Staff at Bell Laboratories, Lucent Technologies, and Agere Systems, a Lucent spin-off, from 1998 to 2001, where he worked in the Multimedia Communications Research Laboratory. He received his Diplom degree in Electrical Engineering from the Technical University of Berlin in 1989 and his Ph.D. (Dr.-Ing.) degree from the University of Hanover in 1997. He also studied at the Massachusetts Institute of Technology in 1989/90 and at the Georgia Institute of Technology in 1993. He was Associate Editor of the IEEE Transactions on Speech and Audio Processing from 2002 until 2006, of the IEEE Transactions on Signal Processing from 2006 to 2009, and of the IEEE Transactions on Multimedia from 2008 to 2011. He is a recipient of the 2006 IEEE Best Paper Award in the Audio and Electroacoustics Area. His research interests are in filter banks, audio coding, music signal processing, and deep learning for multimedia. He is probably best known for his work on low delay filter banks, which became part of the MPEG-4 AAC-ELD audio coding standard. That standard is now part of the iOS and Android operating systems and is used, for instance, in the FaceTime application.
Course Info: This seminar is both
- Music 322 (Music/Audio Signal Processing Research Overviews Seminar) and
- Music 319 (Hearing Seminar)
FREE
Open to the Public