New internship position with ADASP: Building a large music dataset.
By A. Quélennec

Our group is hiring an L3 or M1 intern on the topic “Building a large music dataset”.

Important information

  • Date: March/April/May 2024
  • Duration: 3 to 4 months
  • Place of work: Palaiseau (Paris outskirts), France
  • Supervisors: Aurian Quélennec, Slim Essid
  • Contact: aurian.quelennec@telecom-paris.fr

Problem statement and context

The aim of the internship is to build a large music dataset from existing datasets (among others [1, 2]) for training deep learning models. This will involve controlling audio quality, choosing standardization criteria, computing dataset statistics and descriptors, etc.
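To give a rough idea of what such tasks can look like in practice, here is a minimal Python sketch. It is an illustration only, not the pipeline that will be used during the internship: the names describe_wav, passes_quality_check and summarize are hypothetical, as are the thresholds (16 kHz minimum sample rate, 1 second minimum duration, peak below full scale), and the sketch assumes 16-bit PCM WAV files read with the Python standard library.

```python
import array
import math
import sys
import wave
from collections import Counter
from dataclasses import dataclass


@dataclass
class ClipInfo:
    path: str
    sample_rate: int   # in Hz
    channels: int
    duration_s: float
    peak: float        # peak absolute amplitude, normalized to [0, 1]
    rms: float         # root-mean-square level, normalized to [0, 1]


def describe_wav(path):
    """Compute simple descriptors for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError(f"{path}: this sketch only handles 16-bit PCM")
        sample_rate = wf.getframerate()
        channels = wf.getnchannels()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":  # WAV data is little-endian
        samples.byteswap()

    full_scale = 32768.0
    peak = max((abs(s) for s in samples), default=0) / full_scale
    if len(samples) > 0:
        rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale
    else:
        rms = 0.0
    return ClipInfo(path, sample_rate, channels, n_frames / sample_rate, peak, rms)


def passes_quality_check(info, min_sample_rate=16000, min_duration_s=1.0,
                         max_peak=0.999):
    """Hypothetical criteria: reject low-rate, very short, silent or clipped clips."""
    return (
        info.sample_rate >= min_sample_rate
        and info.duration_s >= min_duration_s
        and 0.0 < info.peak < max_peak
    )


def summarize(infos):
    """Aggregate per-clip descriptors into dataset-level statistics."""
    total = sum(i.duration_s for i in infos)
    return {
        "n_clips": len(infos),
        "total_duration_s": total,
        "mean_duration_s": total / len(infos) if infos else 0.0,
        "sample_rates": dict(Counter(i.sample_rate for i in infos)),
    }
```

In this sketch, each file is first described with describe_wav, the clips that pass passes_quality_check are kept, and summarize is then applied to the kept clips. Real datasets such as [1, 2] come in other formats and at other scales, so a dedicated audio library would likely replace the wave module.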

This internship will let you put your Python skills to use, apply your knowledge of signal processing, and learn the basics of deep learning. Depending on the progress of the project, it may be possible to go further and explore augmentation techniques that could be applied to the audio in the dataset.

References

[1] Kirell Benzi et al. FMA: A Dataset For Music Analysis. ISMIR 2017.

[2] AudioSet, A large-scale dataset of manually annotated audio events. https://research.google.com/audioset/

Candidate profile

  • Currently finishing an L3 or M1 degree in Data Science, Machine Learning, Signal Processing, or Speech/Audio/Music Processing.
  • Strong Python skills and good theoretical and practical knowledge of signal processing. Knowing PyTorch is a plus.