Network Compression Using Specialists

This page accompanies the paper "Sparse Mixture of Local Experts for Efficient Speech Enhancement," submitted to Interspeech 2020.

These examples show that the proposed specialist network, despite its smaller size (512×2), can compete with a large generalist (1024×3).
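As a rough illustration of the complexity gap, the sketch below counts parameters for two recurrent stacks. It assumes the "H×L" notation means hidden units × layers, that the networks are LSTM-based, and a 513-bin spectrogram input; these are assumptions for illustration, not details taken from this page or the paper.

```python
import torch.nn as nn

def lstm_param_count(hidden_size, num_layers, input_dim=513):
    """Count parameters of a uni-directional LSTM stack.

    Assumptions (not from this page): "H x L" means hidden units x layers,
    the enhancement networks are LSTMs, and input_dim is a 513-bin
    magnitude spectrogram.
    """
    net = nn.LSTM(input_size=input_dim, hidden_size=hidden_size,
                  num_layers=num_layers, batch_first=True)
    return sum(p.numel() for p in net.parameters())

print(f"specialist  512x2: {lstm_param_count(512, 2):,} parameters")
print(f"generalist 1024x3: {lstm_param_count(1024, 3):,} parameters")
```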

Example #1

Ground-truth speech source
Noise source
Mixture (-5.33 dB SiSDR)
Recovered speech from the proposed specialist (512×2): 12.86 dB SiSDR
Recovered speech from the small generalist (512×2): 11.31 dB SiSDR
Recovered speech from the large generalist (1024×3): 12.99 dB SiSDR

Example #2

Ground-truth speech source
Noise source
Mixture (-4.36 dB SiSDR)
Recovered speech from the proposed specialist (512×2): 11.37 dB SiSDR
Recovered speech from the small generalist (512×2): 10.21 dB SiSDR
Recovered speech from the large generalist (1024×3): 11.13 dB SiSDR
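The SiSDR numbers above are scale-invariant signal-to-distortion ratios between each recovered signal and the ground-truth speech; the value in parentheses after "Mixture" is the same metric computed on the unprocessed noisy input. Below is a minimal sketch of how such a score can be computed; the helper name `si_sdr` and the use of NumPy are illustrative choices, not code from the linked repository.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (illustrative helper, not from the repo)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Remove DC offsets so the projection below is well defined.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the "target" component...
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    # ...and treat the remainder as distortion.
    distortion = estimate - target
    return 10.0 * np.log10(np.sum(target**2) / (np.sum(distortion**2) + eps))
```

Higher is better, so each recovered signal can be compared against the mixture's SiSDR to quantify the improvement.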

Source code

https://github.com/IU-SAIGE/sparse_mle