MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

09/2021 – 07/2022

Research Assistant, Supervised by Prof. Richard Stern, Carnegie Mellon University

  • Constructed 2-layer learnable front ends in the Temporal Modulation Neural Network (TMNN), which combines Mel-like data-driven front ends with temporal modulation filters.
  • Showed that the proposed front ends surpass state-of-the-art (SOTA) methods on the MagnaTagATune dataset in automatic music tagging, and that they also help keyword spotting on speech commands.
  • Analyzed model performance across genre tags and instrument tags.
马英浩 (Nicolaus) MA Yinghao
PhD Student in AI & Music

MA Yinghao is a PhD student at C4DM, QMUL. Research interests include music information retrieval, self-supervised learning, music-related multimodal machine learning, and audio signal processing and matter.
