Learning to mask ambient noise with music using psychoacoustics and deep learning
By C. Berger
Our group is hiring an M2 intern on the topic “Learning to mask ambient noise with music using psychoacoustics and deep learning”.
Important information
- Date: March - September
- Duration: 6 months
- Place of work: Palaiseau (Paris outskirts), France
- Supervisors: Clémentine Berger, Slim Essid, Roland Badeau
- Wage: €4.35 / hour net
- Contact: clementine.berger@telecom-paris.fr
Problem statement and context
Listening to music in noisy environments may negatively impact the listening experience, as the surrounding noise interferes with the perception of the music, partially masking some spectral components of the audio content [1]. Part of the music may even be completely concealed by the noise due to simultaneous masking. For a given audio signal, this effect is quantified using masking thresholds, which indicate the level below which another signal becomes inaudible within specific frequency bands due to the presence of the first signal [2,3].
Thus, music rendering systems have been developed over the years to enhance listening comfort, initially in automotive environments [4,5] and later for more general contexts and personal devices such as headphones or earphones [6,7,8].
One approach involves volume adjustments and compression to increase the loudness of the music when it is masked or partially masked by noise [5,8]. Similarly, adaptive perceptual equalizers have been proposed to restore the original loudness of the music signal when it is affected by ambient noise [4,6].
However, the masking effect works both ways: music may also be used to mask ambient noise.
The goal of this internship is to develop methods for perceptually masking noise with music, for instance in the context of headphones/earbuds, using psychoacoustic models and deep learning.
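As a rough illustration of the masking-threshold idea described above, the following Python sketch compares the band levels of a music signal and a noise signal and flags the bands in which the noise falls below a crude threshold derived from the music. It is a toy example under strong assumptions, not the method to be developed during the internship: the function names, the coarse band edges and the fixed 10 dB offset are all hypothetical, and actual psychoacoustic models [2,3] rely on critical bands, spreading functions and tonality estimation.

```python
import numpy as np


def band_levels_db(signal, sample_rate, band_edges_hz):
    """Return the power (in dB) of `signal` in each band [lo, hi)."""
    # Power spectrum of the Hann-windowed signal.
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal)))) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    levels = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        power = spectrum[in_band].sum() + 1e-12  # avoid log(0)
        levels.append(10.0 * np.log10(power))
    return np.array(levels)


def masked_bands(music, noise, sample_rate, band_edges_hz, offset_db=10.0):
    """Flag bands where the noise lies below a crude masking threshold.

    Assumption: the threshold in each band is the music level in that
    band minus a fixed offset (hypothetical value, for illustration only).
    """
    music_db = band_levels_db(music, sample_rate, band_edges_hz)
    noise_db = band_levels_db(noise, sample_rate, band_edges_hz)
    threshold_db = music_db - offset_db
    return noise_db < threshold_db


# Toy signals: the "music" is a 440 Hz tone; the "noise" has a weak
# component near the music (500 Hz) and a strong one far from it (3 kHz).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
music = np.sin(2 * np.pi * 440 * t)
noise = 0.01 * np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

band_edges_hz = [0, 1000, 2000, 4000, 8000]  # hypothetical coarse bands
mask = masked_bands(music, noise, sample_rate, band_edges_hz)
```

In this toy setting, the weak 500 Hz noise component shares a band with the music and is flagged as masked, whereas the strong 3 kHz component lies in a band where the music has almost no energy and is not.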
References
[1] B. C. J. Moore, B. R. Glasberg, and T. Baer, “A model for the prediction of thresholds, loudness, and partial loudness,” Journal of the Audio Engineering Society, vol. 45, no. 4, pp. 224–240, 1997.
[2] T. Painter and A. Spanias, “Perceptual coding of digital audio,” Proceedings of the IEEE, vol. 88, no. 4, pp. 451–515, 2000.
[3] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models, 3rd ed., ser. Springer Series in Information Sciences. Springer, 2010, no. 22.
[4] M. Christoph, “Noise dependent equalization control,” in Audio Engineering Society Conference: 48th International Conference: Automotive Audio. Audio Engineering Society, 2012.
[5] D. Clark, H. Blind, W. Dorfstatter, and E. Geddes, “Compensation for road noise in automotive entertainment systems,” SAE Transactions, pp. 560–565, 1987.
[6] J. Rämö, V. Välimäki, and M. Tikander, “Perceptual headphone equalization for mitigation of ambient noise,” in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013, pp. 724–728.
[7] D. M. Gauger Jr., C. B. Ickler, and D. B. Ramsay, “Collaboratively processing audio between headset and source to mask distracting noise,” U.S. Patent 9 503 803 B2, Nov. 22, 2016.
[8] D. M. Gauger Jr., “Adapted audio masking,” U.S. Patent 8 964 997 B2, Feb. 24, 2015.
Candidate profile
- Master's degree in Computer Science / Mathematics / Signal Processing
- Knowledge of audio signal processing (time‐frequency analysis, filtering) and deep learning
- Programming skills in Python, especially with the PyTorch and PyTorch Lightning libraries
- Personal management and self‐organization skills
- Interest in music and in audio processing is a plus