Algorithm info: Test of https://github.com/facebookresearch/sam-audio
Model version: facebook/sam-audio-large
Prompts: extract vocals, extract bass, extract drums.
Note 1: Sample rate was converted from 44100 to 48000 and back after extraction
Note 2: Each channel was processed independently since SAM Audio works with mono.
Metrics:
Metric sdr for vocals: 2.0977
Metric si_sdr for vocals: -3.7302
Metric l1_freq for vocals: 19.3237
Metric log_wmse for vocals: 6.3669
Metric aura_stft for vocals: 4.9112
Metric aura_mrstft for vocals: 6.3615
Metric bleedless for vocals: 14.3590
Metric fullness for vocals: 12.3817
Metric sdr for bass: 7.2636
Metric si_sdr for bass: 6.3368
Metric l1_freq for bass: 43.5531
Metric log_wmse for bass: 13.9581
Metric aura_stft for bass: 5.1388
Metric aura_mrstft for bass: 5.2803
Metric bleedless for bass: 18.4478
Metric fullness for bass: 20.5992
Metric sdr for drums: 6.4736
Metric si_sdr for drums: 4.2409
Metric l1_freq for drums: 26.1768
Metric log_wmse for drums: 9.9466
Metric aura_stft for drums: 6.2540
Metric aura_mrstft for drums: 6.9321
Metric bleedless for drums: 20.1012
Metric fullness for drums: 14.3422
Date added: 2025-12-20 |