---
license: cc-by-nc-4.0
language:
- ja
tags:
- music
- speech
- audio
- audio-to-audio
- a cappella
- vocal ensemble
datasets:
- jaCappella
metrics:
- SI-SDR
---

# DPTNet trained with the jaCappella corpus for vocal ensemble separation

This model was trained by Tomohiko Nakamura using [the codebase](https://github.com/TomohikoNakamura/asteroid_jaCappella).
It was trained on the vocal ensemble separation task of [the jaCappella dataset](https://tomohikonakamura.github.io/jaCappella_corpus/).
[The paper](https://doi.org/10.1109/ICASSP49357.2023.10095569) was presented at ICASSP 2023 ([arXiv](https://arxiv.org/abs/2211.16028)).

# License
See [the jaCappella dataset page](https://tomohikonakamura.github.io/jaCappella_corpus/).

# Citation
See [the jaCappella dataset page](https://tomohikonakamura.github.io/jaCappella_corpus/).

# Configuration
```yaml
data:
  num_workers: 12
  sample_rate: 48000
  samples_per_track: 13
  seed: 42
  seq_dur: 5.046
  source_augmentations:
  - gain
  sources:
  - vocal_percussion
  - bass
  - alto
  - tenor
  - soprano
  - lead_vocal
filterbank:
  kernel_size: 32
  n_filters: 64
  stride: 16
masknet:
  bidirectional: true
  chunk_size: 174
  dropout: 0
  ff_activation: relu
  ff_hid: 256
  hop_size: 128
  in_chan: 64
  mask_act: sigmoid
  n_repeats: 8
  n_src: 6
  norm_type: gLN
  out_chan: 64
optim:
  lr: 0.005
  optimizer: adam
  weight_decay: 1.0e-05
training:
  batch_size: 1
  early_stop: true
  epochs: 600
  gradient_clipping: 5
  half_lr: true
  loss_func: pit_sisdr
```

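To make the `data` and `filterbank` settings more concrete, the sketch below converts them into sample and frame counts. It is only an illustration: the helper names `segment_samples` and `encoder_frames` are hypothetical, and the frame count assumes a plain strided 1-D convolution with no padding, which may differ from what the codebase actually does.

```python
# Values copied from the configuration above.
SAMPLE_RATE = 48000  # data.sample_rate [Hz]
SEQ_DUR = 5.046      # data.seq_dur [s]
KERNEL_SIZE = 32     # filterbank.kernel_size [samples]
STRIDE = 16          # filterbank.stride [samples]


def segment_samples(seq_dur: float, sample_rate: int) -> int:
    """Number of samples in one training segment."""
    return round(seq_dur * sample_rate)


def encoder_frames(n_samples: int, kernel_size: int, stride: int) -> int:
    """Frame count of a strided 1-D convolution, assuming no padding."""
    return (n_samples - kernel_size) // stride + 1


n_samples = segment_samples(SEQ_DUR, SAMPLE_RATE)
n_frames = encoder_frames(n_samples, KERNEL_SIZE, STRIDE)
window_ms = 1000 * KERNEL_SIZE / SAMPLE_RATE
hop_ms = 1000 * STRIDE / SAMPLE_RATE

print(f"samples per segment: {n_samples}")
print(f"encoder frames (no padding): {n_frames}")
print(f"encoder window: {window_ms:.3f} ms, hop: {hop_ms:.3f} ms")
```

Under these assumptions, each 5.046 s training segment holds 242208 samples, and the learned filterbank works on sub-millisecond windows.
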
# Results (SI-SDR [dB]) on vocal ensemble separation

| Method | Lead vocal | Soprano | Alto | Tenor | Bass | Vocal percussion |
|:------:|:----------:|:-------:|:----:|:-----:|:----:|:----------------:|
| DPTNet | 8.9 | 8.5 | 11.9 | 14.9 | 19.7 | 21.9 |
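
For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of scale-invariant SDR (SI-SDR) for a single source. It follows the usual definition (project the estimate onto the reference, then compare the projected energy with the residual energy) and assumes mean removal before projection. It is not the evaluation script used to produce the numbers above, and the function name `si_sdr` and the toy signals are illustrative only.

```python
import math
from typing import Sequence


def si_sdr(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Scale-invariant SDR in dB between a reference and an estimate."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean of each signal (a common convention, assumed here).
    ref_mean = sum(reference) / n
    est_mean = sum(estimate) / n
    ref = [r - ref_mean for r in reference]
    est = [e - est_mean for e in estimate]
    # Project the estimate onto the reference to obtain the scaled target.
    scale = sum(e * r for e, r in zip(est, ref)) / sum(r * r for r in ref)
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    # Note: a perfect estimate gives zero residual energy and is not handled here.
    return 10.0 * math.log10(target_energy / residual_energy)


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]  # reference plus a small orthogonal error
print(round(si_sdr(reference, estimate), 2))  # 20.0

scaled = [3.0 * e for e in estimate]
print(round(si_sdr(reference, scaled), 2))  # 20.0, rescaling leaves the score unchanged
```

Higher values are better. In the toy example the residual energy is 1/100 of the target energy, which gives 20 dB.
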