The Unified Theory of Blind Source Separation Based on Independence, Nonnegativity, and Low-rankness
By K. Yoshii
Abstract
Nonnegative matrix factorization (NMF) is the most basic factorization method for matrix data (e.g., time-frequency data) based on nonnegativity and low-rankness, and it has been widely used for single-channel blind source separation (BSS). It approximates a set of nonnegative vectors as weighted sums of nonnegative basis vectors. Nonnegative tensor factorization (NTF) is a naive multi-dimensional extension of NMF that can deal with tensor data (e.g., time-frequency-channel data). In contrast, we proposed a mathematically more essential, covariance-aware extension of NMF called positive semidefinite tensor factorization (PSDTF) [Yoshii+ 2013], which approximates a set of covariance matrices as weighted sums of basis covariance matrices. We further proposed a multi-dimensional extension of PSDTF called correlated tensor factorization (CTF) [Yoshii 2018], which approximates the full covariance matrix over a tensor as a sum of Kronecker products of dimension-wise sets of basis covariance matrices. To reduce the computational complexity of CTF, we then proposed FastCTF, which is based on the joint diagonalization of the basis covariance matrices [Yoshii+ 2018]. FastCTF is a theoretically ultimate and computationally feasible single-channel BSS method based on independence, nonnegativity, and low-rankness, and it can be extended to multi-channel BSS, resulting in FastMCTF [Yoshii+ 2020], which jointly deals with time, frequency, and spatial covariance matrices. Interestingly, we revealed that the proposed methods include, as special cases, state-of-the-art single- and multi-channel BSS methods that were independently developed by different researchers.
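For reference, the three factorization models mentioned above can be written schematically as follows; the notation is a simplified sketch and may differ from that used in the cited papers:

\[
\text{NMF:}\quad \mathbf{x}_n \;\approx\; \sum_{k=1}^{K} h_{kn}\,\mathbf{w}_k,
\qquad \mathbf{w}_k \in \mathbb{R}_{+}^{M},\; h_{kn} \ge 0,
\]
\[
\text{PSDTF:}\quad \mathbf{X}_n \;\approx\; \sum_{k=1}^{K} h_{kn}\,\mathbf{V}_k,
\qquad \mathbf{V}_k \succeq 0,\; h_{kn} \ge 0,
\]
\[
\text{CTF:}\quad \boldsymbol{\Sigma} \;\approx\; \sum_{k=1}^{K} \mathbf{U}_k \otimes \mathbf{V}_k,
\qquad \mathbf{U}_k \succeq 0,\; \mathbf{V}_k \succeq 0,
\]

where in NMF the nonnegative observation vectors \(\mathbf{x}_n\) (e.g., magnitude or power spectra) are approximated as weighted sums of nonnegative basis vectors \(\mathbf{w}_k\); in PSDTF the observed positive semidefinite matrices \(\mathbf{X}_n\) (e.g., frame-wise covariance matrices) are approximated as weighted sums of basis positive semidefinite matrices \(\mathbf{V}_k\); and in CTF the full covariance matrix \(\boldsymbol{\Sigma}\) of the vectorized tensor data is approximated as a sum of Kronecker products of dimension-wise basis covariance matrices \(\mathbf{U}_k\) and \(\mathbf{V}_k\).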
Bio
Dr. Kazuyoshi Yoshii received the Ph.D. degree in informatics from Kyoto University, Kyoto, Japan, in 2008. He is currently an Associate Professor at the Graduate School of Informatics, Kyoto University, Kyoto, Japan, and concurrently the Leader of the Sound Scene Understanding Team, RIKEN Center for Advanced Intelligence Project (AIP), Tokyo, Japan. His research interests include music analysis, audio signal processing, and machine learning. More information is available on the speaker’s website.