Dataset distillation for Audio-visual tasks

Date: Mon, 07/29/2024 - 3:00pm - 4:00pm

Location: CCRMA Classroom

Event Type: Guest Lecture

Saksham Singh (UT Dallas) joins us to discuss his PhD research.

Dataset distillation aims to create a condensed dataset that retains the essential information of the entire training data. While recent dataset distillation techniques have shown remarkable performance on image datasets, their potential in other domains remains largely unexplored. We extend the concept to the audio-visual domain and introduce audio-visual dataset distillation: the task of creating smaller yet representative synthetic datasets that maintain the cross-modal semantic associations between the audio and visual modalities. To address this task, we extend the Distribution Matching approach and introduce additional cross-modal alignment losses. Comprehensive experiments on recognition and cross-modal retrieval tasks demonstrate the representativeness and effective audio-visual alignment of our distilled data.
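For readers new to the idea, the following is a minimal sketch of what a distribution-matching objective with a cross-modal term could look like. It is not the speaker's implementation. It assumes that audio and visual features have already been embedded into a shared space of equal dimension, and every function name, the `weight` parameter, and the specific form of the alignment term are hypothetical choices made for illustration.

```python
import math


def mean_embedding(batch):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def distribution_matching_loss(real, synthetic):
    """Distance between the mean embeddings of the real and synthetic sets."""
    return squared_distance(mean_embedding(real), mean_embedding(synthetic))


def cosine_similarity(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def cross_modal_gap(real_audio, real_visual, syn_audio, syn_visual):
    """Hypothetical alignment term (an assumption, not the talk's loss):
    penalize a mismatch between the audio-visual similarity of the real
    data and that of the synthetic data."""
    real_sim = cosine_similarity(mean_embedding(real_audio),
                                 mean_embedding(real_visual))
    syn_sim = cosine_similarity(mean_embedding(syn_audio),
                                mean_embedding(syn_visual))
    return (real_sim - syn_sim) ** 2


def total_loss(real_audio, real_visual, syn_audio, syn_visual, weight=1.0):
    """Per-modality distribution matching plus a weighted cross-modal term."""
    return (distribution_matching_loss(real_audio, syn_audio)
            + distribution_matching_loss(real_visual, syn_visual)
            + weight * cross_modal_gap(real_audio, real_visual,
                                       syn_audio, syn_visual))
```

In a real pipeline the embeddings would come from a neural network and the synthetic samples would be updated by gradient descent on such a loss; the sketch only evaluates the objective on fixed vectors.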

NOTE: this event is part of the 2024 DL4MIR workshop series (ccrma-mir.github.io); guest speaker talks are open to the broader CCRMA community.

FREE

Open to the Public
