4D Audio-Visual Learning: A Visual Perspective of Sound Propagation and Production
Date:
Thu, 08/01/2024, 3:00pm – 4:00pm
Location:
CCRMA Classroom
Event Type:
Guest Lecture

Humans use multiple modalities to perceive the world, including vision, sound, touch, and smell. Among them, vision and sound are two of the most important modalities, and they naturally co-occur. Recent works have explored this natural correspondence between sight and sound, but they are mainly object-centric, i.e., they focus on the semantic relations between objects and the sounds they make. While exciting, these works often overlook the correspondence with the surrounding 3D space. For example, we hear the same sound differently in different environments, or even at different locations within the same environment. In this talk, I present 4D audio-visual learning, which learns the correspondence between sight and sound in spaces, providing a visual perspective of sound propagation and sound production. More specifically, I focus on four topics in this direction: simulating sounds in spaces, navigating with sounds in spaces, synthesizing sounds in spaces, and learning action sounds in spaces. Throughout these topics, I use vision as the main bridge connecting audio and scene understanding, and I show promising results in building fundamental simulation platforms, enabling multimodal embodied navigation, producing faithful multimodal synthesis in 3D environments, and learning how actions sound from in-the-wild egocentric videos. I show results on real videos and in real-world environments, as well as in simulation. In the last part of the talk, I will discuss open directions for future research in 4D audio-visual learning.
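As a minimal illustration of the location-dependent hearing the abstract describes (not material from the talk itself), the sketch below convolves one dry source signal with two hypothetical room impulse responses (RIRs); the delays and gains are invented stand-ins for measured or simulated responses:

    # Minimal sketch: the received sound is (approximately) the dry source
    # convolved with a room impulse response that depends on the environment
    # and the listener's location. Both toy RIRs here are hypothetical.
    import numpy as np
    from scipy.signal import fftconvolve

    sr = 16000                            # sample rate in Hz
    t = np.arange(sr) / sr
    dry = np.sin(2 * np.pi * 440 * t)     # 1 s dry source: a 440 Hz tone

    def toy_rir(delays_s, gains, sr, length_s=0.5):
        """Build a sparse toy RIR from echo delays (seconds) and gains."""
        rir = np.zeros(int(length_s * sr))
        for d, g in zip(delays_s, gains):
            rir[int(d * sr)] += g
        return rir

    # Hypothetical RIRs: a small, dry room vs. a large reverberant hall.
    rir_small = toy_rir([0.0, 0.01], [1.0, 0.3], sr)
    rir_hall = toy_rir([0.0, 0.05, 0.12, 0.25, 0.4],
                       [1.0, 0.6, 0.4, 0.25, 0.1], sr)

    # The same dry sound, rendered under two different acoustic conditions:
    # convolution with each RIR yields two audibly different signals.
    wet_small = fftconvolve(dry, rir_small)
    wet_hall = fftconvolve(dry, rir_hall)
    print(wet_small.shape, wet_hall.shape)

In practice, the toy RIRs would be replaced by responses measured in real rooms or produced by an acoustic simulator, which is the kind of problem the "simulating sounds in spaces" topic above addresses.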
NOTE: This event is part of the DL4MIR workshop series (ccrma-mir.github.io); guest speaker talks are open to the broader CCRMA community.
FREE
Open to the Public