School of Electronic Engineering and Computer Science

Profile

Project title:

Multi-modal Learning for Music Understanding

Abstract:

Music is not only an auditory experience; it is also deeply integrated with visible performance. When a musician plays an instrument, hand pose is tightly synchronized with the sound produced. 3D hand pose estimation (3D HPE) detects the positions and orientations of the hands from images or videos, allowing finger movement to be quantified accurately and facilitating a wide range of applications in music understanding. Despite rapid progress in 3D HPE, it remains a challenging task under occlusion, low resolution and fast finger movements, especially for instrument playing, which requires more dexterous gestures and high-level skills. To address these problems, the proposed study will take music sounds as an additional input and explore a novel multimodal fusion strategy that exploits the inner correlation between hand pose and sound. As a result, this research aims to improve performance in music understanding and stimulate more innovative applications in industry.

C4DM theme affiliation:

Augmented Instruments, Music Cognition, Music Informatics

Research

Research Interests:

Multi-modal learning, music video understanding
