Non-negative Hidden Markov Modeling of Audio with Application to Source Separation
Title | Non-negative Hidden Markov Modeling of Audio with Application to Source Separation |
Publication Type | Conference Paper |
Year of Publication | 2010 |
Authors | Mysore, G. J., P. Smaragdis, and B. Raj |
Conference Name | International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA) |
Date Published | 09/2010 |
Conference Location | St. Malo, France |
Abstract | In recent years, there has been a great deal of work in modeling audio using non-negative matrix factorization and its probabilistic counterparts, as they yield rich models that are very useful for source separation and automatic music transcription. Given a sound source, these algorithms learn a dictionary of spectral vectors to best explain it. This dictionary is, however, learned in a manner that disregards a very important aspect of sound: its temporal structure. We propose a novel algorithm, the non-negative hidden Markov model (N-HMM), that extends the aforementioned models by jointly learning several small spectral dictionaries as well as a Markov chain that describes the structure of changes between these dictionaries. We also extend this algorithm to the non-negative factorial hidden Markov model (N-FHMM) to model sound mixtures, and demonstrate that it yields superior performance in single-channel source separation tasks. |
URL | https://ccrma.stanford.edu/~gautham/Site/NFHMM.html |