MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training (CCF none)
09/2021 – 07/2022
Research Assistant, Supervised by Prof. Richard Stern, Carnegie Mellon University
- Constructed two-layer learnable front ends for the Temporal Modulation Neural Network (TMNN), combining Mel-like data-driven front ends with temporal modulation filters.
- Verified that the proposed front ends surpass state-of-the-art (SOTA) methods in automatic music tagging on the MagnaTagATune dataset and also benefit keyword spotting on speech commands.
- Analyzed model performance across genre tags and instrument tags.