Robust sound event detection
By Mauricio Michel Olvera Zambrano

Michel Olvera, a postdoctoral researcher in the Audio Data Analysis and Signal Processing (ADASP) group at Télécom Paris, will give a talk on robust sound event detection:


From industry to general interest applications, computational analysis of sound scenes and events allows us to interpret the continuous flow of everyday sounds. One of the main sources of degradation when moving from lab conditions to the real world is that sound scenes are not composed of isolated events but of multiple simultaneous events. Differences between training and test conditions also often arise from extrinsic factors, such as the choice of recording hardware and microphone positions, as well as from factors intrinsic to sound events, such as their frequency of occurrence, duration, and variability. In this talk, we'll discuss problems of wide interest in audio analysis, focusing on how to achieve robustness in real scenarios. First, we explore the separation of ambient sounds in a practical setting in which multiple short-duration sound events with fast-varying spectral characteristics (i.e., foreground sounds) occur simultaneously with stationary background sounds. Second, we investigate how to improve the robustness of audio analysis systems under mismatched training and test conditions, exploring two distinct tasks: acoustic scene classification with mismatched recording devices and training of sound event detection systems with synthetic and real data.
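To make the foreground/background setting concrete, here is a minimal, hypothetical sketch (not taken from the talk): because a stationary background changes little over time while foreground events are brief, the temporal median of each frequency bin of a magnitude spectrogram is a simple background estimate, and the clipped residual captures the foreground. All signal shapes and values below are illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration: separating a stationary background from short,
# fast-varying foreground events in a synthetic magnitude "spectrogram"
# (frequency x time matrix). A stationary background is roughly constant
# over time, so the per-bin temporal median estimates it well, while brief
# foreground bursts barely shift the median.

rng = np.random.default_rng(0)

n_freq, n_time = 64, 200
# Background: a fixed spectral slope, constant over time.
background = np.linspace(1.0, 0.2, n_freq)[:, None] * np.ones((1, n_time))

# Foreground: a few short bursts (10 frames each) with random spectra.
foreground = np.zeros((n_freq, n_time))
for start in (30, 90, 150):
    foreground[:, start:start + 10] = 3.0 * rng.random((n_freq, 10))

mix = background + foreground

# Background estimate: median over time for each frequency bin.
bg_est = np.median(mix, axis=1, keepdims=True)

# Foreground estimate: residual above the background, clipped at zero.
fg_est = np.maximum(mix - bg_est, 0.0)

bg_err = np.abs(bg_est - background[:, :1]).max()
print(f"max background estimation error: {bg_err:.3f}")  # → 0.000
```

Here the bursts occupy only 30 of 200 frames per bin, so the median lands exactly on the background level; real recordings would of course need an STFT front-end and a more careful separation model, which is part of what the talk addresses.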


Michel Olvera is a postdoctoral researcher in the Audio Data Analysis and Signal Processing (ADASP) group at Télécom Paris. Prior to joining Télécom Paris, Michel was part of the Multispeech research team at INRIA Nancy - Grand Est. He received his Ph.D. in 2022 from the Université de Lorraine in Nancy, France, where he developed domain adaptation strategies to improve audio analysis tasks under mismatched conditions. Before that, he earned his B.S. in Telecommunications Engineering in 2017 and his M.S. in Signal Processing in 2019, both from the National Autonomous University of Mexico (UNAM). His research interests lie at the intersection of machine listening and domain adaptation, with a focus on developing new algorithms and technologies for audio analysis and understanding in real-world scenarios.