Vit Large 23 (vocals, instrum)
VitLarge23 is an experimental model based on Vision Transformers. Its metrics are slightly below those of MDX23C, but it may work better in some cases.
Quality table
| Algorithm name | Multisong dataset: SDR Vocals | Multisong dataset: SDR Instrumental | Synth dataset: SDR Vocals | Synth dataset: SDR Instrumental | MDX23 Leaderboard: SDR Vocals |
|---|---|---|---|---|---|
| Vit Large 23 (512px) v1 | 9.78 | 16.09 | 12.33 | 12.03 | 10.47 |
| Vit Large 23 (512px) v2 | 9.90 | 16.20 | 12.38 | 12.08 | --- |
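
For readers unfamiliar with the SDR columns, here is a minimal sketch of how a signal-to-distortion ratio in dB is commonly computed for a separated stem against its ground-truth reference. It assumes the plain energy-ratio definition; the exact evaluation code behind the numbers above is not given here, so the function name `sdr` and the `eps` stabilizer are illustrative assumptions, not the benchmark's implementation.

```python
import math


def sdr(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Assumed definition (not confirmed by this document):
        10 * log10(sum(ref^2) / sum((ref - est)^2))
    `eps` is a hypothetical stabilizer that avoids division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the ground-truth stem.
    signal_energy = sum(r * r for r in reference)
    # Energy of the residual between ground truth and the separated stem.
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: the estimate is the reference scaled by 0.9,
# so the residual has 1% of the reference energy.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9 * x for x in reference]
print(round(sdr(reference, estimate), 2))
```

In the toy example the residual carries 1% of the reference energy, which gives about 20 dB. Higher values mean the separated stem is closer to the reference.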