Large-Scale Pretrained Model for Self-Supervised Music Audio Representation Learning

Abstract: Self-supervised learning is an under-explored topic for music audio due to the challenge of designing an appropriate training paradigm. We hence propose MAP-MERT, a large-scale music audio pre-trained model for general music understanding. It achieves performance comparable to the state-of-the-art pre-trained model Jukebox while using less than 2% of the parameters.
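
As a rough illustration of how such a pre-trained model can serve general music understanding, the sketch below extracts frozen clip-level representations for a downstream task. It assumes the model is published on the Hugging Face Hub under an identifier like `m-a-p/MERT-v1-95M` and loads through the standard `transformers` path; the checkpoint name, input sampling rate, and pooling choice are illustrative assumptions, not details stated in the abstract.

```python
# Minimal sketch: frozen feature extraction from a pretrained music audio model.
# The checkpoint id "m-a-p/MERT-v1-95M" and the resampling rate are assumptions.
import torch
import torchaudio
from transformers import AutoModel, Wav2Vec2FeatureExtractor

model_id = "m-a-p/MERT-v1-95M"  # assumed Hub identifier
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(model_id, trust_remote_code=True)
model.eval()

# Load a clip and resample to the rate the feature extractor expects.
waveform, sr = torchaudio.load("example.wav")
target_sr = processor.sampling_rate
if sr != target_sr:
    waveform = torchaudio.transforms.Resample(sr, target_sr)(waveform)
waveform = waveform.mean(dim=0)  # downmix to mono

inputs = processor(waveform.numpy(), sampling_rate=target_sr, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# One embedding per clip: mean-pool the last hidden layer over time.
clip_embedding = outputs.hidden_states[-1].mean(dim=1)
print(clip_embedding.shape)  # (1, hidden_size)
```

Such frozen embeddings can then be fed to a lightweight probe (e.g. a linear classifier) for tagging, key, or genre tasks, which is the usual way general-purpose music representations are evaluated.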

马英浩 (Nicolaus) MA Yinghao
PhD Student in AI & Music

MA Yinghao is a PhD student at C4DM, QMUL. His research interests include music information retrieval, self-supervised learning, music-related multimodal machine learning, and audio signal processing.
