Title:
Improvement of joint optimization of masks and deep recurrent neural
networks for monaural speech separation using optimized activation
functions
Author(s):
MASOOD Asim; YE Zhongfu;
Affiliation(s):
National Engineering Laboratory for Speech and Language Information
Processing, University of Science and Technology of China
Abstract:
Single-channel speech separation has been a challenging task for the
speech separation community for the last three decades. Thanks to
deep learning, it is now possible to separate speech using deep
neural networks (DNNs) and deep recurrent neural networks (DRNNs),
and researchers are working to improve various DNN and DRNN models
for monaural speech separation. In this paper, we improve an existing
DNN- and DRNN-based model for monaural speech separation by using
optimized activation functions. Instead of the rectified linear unit
(ReLU), we implement the leaky ReLU, the exponential linear unit, the
exponential function, the inverse square root linear unit and the
inverse cubic root linear unit (ICRLU) as activation functions. The
ICRLU and the exponential function are new activation functions
proposed in this work. These activation functions overcome the dying
ReLU problem, achieve better separation results than the ReLU
function, and reduce the computational cost of DNN- and DRNN-based
monaural speech separation.
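
As a minimal illustration, the sketch below implements the compared
activations in Python/NumPy under common definitions. The leaky ReLU,
ELU and ISRLU forms are the standard ones from the literature; the
ICRLU form shown here is only a hypothetical cubic analogue of ISRLU,
and the proposed exponential function is omitted, because the abstract
does not specify either formula.

    import numpy as np

    def relu(x):
        # Standard ReLU: zero output and zero gradient for x < 0,
        # the source of the "dying ReLU" problem.
        return np.maximum(0.0, x)

    def leaky_relu(x, slope=0.01):
        # Small negative slope keeps gradients alive for x < 0.
        return np.where(x >= 0, x, slope * x)

    def elu(x, alpha=1.0):
        # Exponential linear unit: smooth, saturating negative
        # branch; expm1 and the clamp keep it numerically stable.
        return np.where(x >= 0, x,
                        alpha * np.expm1(np.minimum(x, 0.0)))

    def isrlu(x, alpha=1.0):
        # Inverse square root linear unit:
        # x / sqrt(1 + alpha * x^2) for x < 0, identity otherwise.
        return np.where(x >= 0, x,
                        x / np.sqrt(1.0 + alpha * x * x))

    def icrlu(x, alpha=1.0):
        # Hypothetical ICRLU (assumed cubic analogue of ISRLU):
        # x / cbrt(1 + alpha * |x|^3) for x < 0; not the paper's
        # verified definition.
        return np.where(x >= 0, x,
                        x / np.cbrt(1.0 + alpha * np.abs(x) ** 3))

All of these leave positive inputs unchanged but, unlike ReLU, return
a nonzero value and gradient for negative inputs, which is why they
avoid the dying ReLU problem described above.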