Python Programs and Book for building an audio coder and for deep learning for audio

Date:

Fri, 10/07/2022 - 3:30pm - 4:20pm

Event Type:

DSP Seminar

Location: CCRMA Classroom [Knoll 217]

Abstract: Audio coding has become a ubiquitous tool for transmitting and storing audio signals, for instance in high-quality teleconferencing with "FaceTime" or similar applications, and in almost every way of listening to music. A more recent tool for adaptive audio processing is deep learning, which so far has mostly been used for image and speech processing but is increasingly applied to audio as well. Both can be implemented and experimented with in Python, which allows for fast prototyping and is also the main programming language for deep learning. This talk will present, in a sense, "what I did during the pandemic": Python tools and examples for building an audio coder, and examples for my tutorial on "Deep Learning for Audio". These tools are available in public GitHub repositories, together with accompanying Python Colab notebooks, and are described in my new book, "Filter Banks and Audio Coding - Compressing Audio Signals Using Python" (Springer), and in videos on YouTube:

https://github.com/TUIlmenauAMS/AudioCoding_Tutorials
https://github.com/TUIlmenauAMS/Python-Audio-Coder
https://github.com/TUIlmenauAMS/AES_Tutorial_2021

Bio: Gerald Schuller has been a full professor at the Institute for Media Technology of the Technical University of Ilmenau since 2008. He was head of the Audio Coding for Special Applications group of the Fraunhofer Institute for Digital Media Technology (IDMT) in Ilmenau, Germany, from January 2002 until 2008, and is now a member of Fraunhofer IDMT. Before joining the Fraunhofer Institute, he was a Member of Technical Staff at Bell Laboratories, Lucent Technologies, and at Agere Systems, a Lucent spin-off, from 1998 to 2001, where he worked in the Multimedia Communications Research Laboratory. He received his Diplom degree in Electrical Engineering from the Technical University of Berlin in 1989 and his Ph.D. (Dr.-Ing.) degree from the University of Hanover in 1997. He also studied at the Massachusetts Institute of Technology in 1989/90 and at the Georgia Institute of Technology in 1993. He was Associate Editor of the IEEE Transactions on Speech and Audio Processing from 2002 until 2006, of the IEEE Transactions on Signal Processing from 2006 to 2009, and of the IEEE Transactions on Multimedia from 2008 to 2011. He is a recipient of the 2006 IEEE Best Paper Award in the Audio and Electroacoustics Area. His research interests are in filter banks, audio coding, music signal processing, and deep learning for multimedia. He is probably best known for his work on low delay filter banks, which became part of the MPEG-4 ELD-AAC audio coding standard. That standard is now part of the iOS and Android operating systems and is used, for instance, in the FaceTime application.

Course Info: This seminar is both Music 322 (Music/Audio Signal Processing Research Overviews Seminar) and Music 319 (Hearing Seminar).

FREE

Open to the Public
