New Master's internship position with ADASP: Improving Urban Sound Event Detection with Unsupervised Source Separation
By F. Angulo
Our group is hiring a Master's intern on the topic “Improving Urban Sound Event Detection with Unsupervised Source Separation”.
- Date: March/April 2023
- Duration: 5 to 6 months
- Place of work: Palaiseau (Paris outskirts), France
- Remuneration: 600€/month
- Supervisors: Florian Angulo, Slim Essid, Geoffroy Peeters
- Contact: email@example.com
Problem statement and context
Sound event detection (SED) consists of automatically predicting which sound classes occur in an audio recording and when. It has promising applications in urban noise pollution monitoring. SED in real conditions involves complex soundscapes with overlapping sound events and varying sound levels. Several works have proposed using unsupervised sound separation to improve SED systems under those challenging conditions. Recently, this approach was successfully applied to bird sound event detection. This motivates us to investigate how well it applies to general-purpose urban sound event detection.
In this internship, linked to a PhD thesis project, we propose to:
- Re-implement and tune an efficient unsupervised urban sound source separation system based on state-of-the-art deep neural networks and training strategies.
- Use this system jointly with (provided) SED systems on multiple urban sound datasets and analyse its performance in comparison with SED systems applied directly to the audio (without separating sources), as done in prior work.
- Explore self-training strategies to improve the joint separation-classification system.
References
Cartwright et al. “SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context.” DCASE, 2020.
Turpault et al. “Improving Sound Event Detection In Domestic Environments Using Sound Separation.” DCASE, 2020.
Denton et al. “Improving Bird Classification with Unsupervised Sound Separation.” ICASSP, 2022.
Tzinis et al. “Compute and Memory Efficient Universal Sound Source Separation.” Journal of Signal Processing Systems, 2022.
Wisdom et al. “Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation.” WASPAA, 2021.
Pishdadian et al. “Finding Strength in Weakness: Learning to Separate Sounds With Weak Supervision.” IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019.
Zhang et al. “Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation.” CVPR, 2022.
Candidate profile
- The candidate is currently finishing an M2 degree in Data Science, Machine Learning, Signal Processing, or Speech/Audio/Music Processing.
- Strong Python skills and good theoretical and practical knowledge of deep learning (using PyTorch) are required.