School of Electronic Engineering and Computer Science

Xavier D'Cruz

PhD Student

Email: x.m.dcruz@qmul.ac.uk

Profile

Project title:

Beyond Supervised Deep Learning For Musical Audio: Can pretraining with unlabelled data improve deep networks for Music Source Separation?

Abstract:

Pretraining on large unlabelled datasets, whether unsupervised or self-supervised, has proven effective in several areas of deep learning, including Computer Vision (CV) and Natural Language Processing (NLP). The rationale is that pretraining on cheap unlabelled data is an effective alternative to the costly gathering of labelled data: networks can learn to disentangle features and reach a "good" initialization state that supports better generalization on the downstream task, which is then trained with supervised learning.

I propose to investigate the effectiveness of this strategy for music-related tasks, with the goal of leveraging large amounts of unlabelled audio data to train generic models that can then be applied to specific downstream tasks, potentially achieving better results with less labelled data. This will involve attempting to learn rich feature representations from unlabelled data with Denoising Diffusion networks, and then fine-tuning these encodings for the task of Audio Source Separation. While Music Source Separation will be the specific focus, the approach is general enough to apply, with minimal modifications, to other tasks such as Audio Classification, Music Information Retrieval, Musical Style Transfer, or Biometric Voice Recognition, which could form the basis of future work incorporating multi-task learning.
