In single-channel speech enhancement, spectral subtraction is known to face two open problems. The first is how to estimate the noise power spectral density (NPSD) in adverse environments. The second is how to suppress non-stationary noise components effectively even when the NPSD is severely underestimated. Researchers have devoted considerable effort to both problems over the last four decades.
Researchers HU Xiaohu, WANG Shiwei, ZHENG Chengshi and LI Xiaodong from the Communication Acoustics Laboratory, Institute of Acoustics, Chinese Academy of Sciences, have proposed a new scheme to improve the tracking capability of existing NPSD estimation methods. The scheme is based on the fact that voiced speech often lasts a long time.
The proposed algorithm combines cepstrum-based preprocessing and postprocessing for single-channel speech enhancement. The preprocessing scheme reduces the impact of voiced speech on NPSD estimation: by eliminating the harmonic components of voiced speech, it avoids overestimating the NPSD while tracking non-stationary noise components. The postprocessing scheme suppresses both some non-stationary noise components and the annoying musical noise without introducing audible speech distortion.
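As a rough illustration of how harmonic components can be removed in the cepstral domain, the sketch below zeroes the pitch peak of one frame's cepstrum and transforms back to a smoothed power spectrum. It is not the authors' algorithm, whose details are not given in this article. The function name `remove_pitch_peak`, the pitch search range (70 to 400 Hz) and the notch half-width are all assumptions chosen for the example.

```python
import numpy as np

def remove_pitch_peak(power_spectrum, fs, f0_min=70.0, f0_max=400.0, half_width=2):
    """Illustrative only: flatten the harmonic ripple of one frame.

    power_spectrum: one-sided power spectrum, length n_fft // 2 + 1.
    fs: sampling rate in Hz.
    f0_min, f0_max, half_width: assumed values, not taken from the paper.
    Returns the smoothed one-sided power spectrum and the peak quefrency index.
    """
    n_fft = 2 * (len(power_spectrum) - 1)

    # Real cepstrum: inverse FFT of the log power spectrum.
    log_spec = np.log(np.maximum(power_spectrum, 1e-12))
    cep = np.fft.irfft(log_spec, n=n_fft)

    # Search for the pitch peak in the quefrency range of plausible pitch periods.
    q_lo = int(fs / f0_max)
    q_hi = min(int(fs / f0_min), n_fft // 2)
    peak = q_lo + int(np.argmax(cep[q_lo:q_hi + 1]))

    # Zero a small neighbourhood of the peak and its mirror image,
    # so the cepstrum stays symmetric and the spectrum stays real.
    lo = max(peak - half_width, 1)
    hi = min(peak + half_width, n_fft // 2)
    cep[lo:hi + 1] = 0.0
    cep[n_fft - hi:n_fft - lo + 1] = 0.0

    # Back to the power spectral domain.
    smoothed = np.exp(np.fft.rfft(cep, n=n_fft).real)
    return smoothed, peak
```

In a setup like this, the smoothed spectrum rather than the raw periodogram would be passed to whatever NPSD tracker is in use, so that strong pitch harmonics are less likely to be mistaken for noise.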
Experimental results show that the proposed algorithm can track non-stationary noise effectively without overestimating the NPSD. Moreover, it achieves better performance in terms of both segmental signal-to-noise ratio (SNR) improvement and Perceptual Evaluation of Speech Quality (PESQ) improvement.
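For readers unfamiliar with the first metric, the sketch below computes a segmental SNR and its improvement in one common form: frame-wise SNR in dB, clamped to a fixed range and averaged. The frame length and the clamping limits of -10 dB and 35 dB are assumed conventions, and the helper names are hypothetical. The exact settings used in the paper are not stated in this article.

```python
import numpy as np

def segmental_snr(clean, processed, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean of frame-wise SNRs in dB, each clamped to [lo_db, hi_db].

    frame_len, lo_db and hi_db are assumed values for illustration.
    """
    clean = np.asarray(clean, dtype=float)
    processed = np.asarray(processed, dtype=float)
    n_frames = min(len(clean), len(processed)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")

    snrs = []
    for i in range(n_frames):
        s = clean[i * frame_len:(i + 1) * frame_len]
        e = s - processed[i * frame_len:(i + 1) * frame_len]
        # Small constant guards against division by zero and log of zero.
        snr = 10.0 * np.log10((np.sum(s ** 2) + 1e-12) / (np.sum(e ** 2) + 1e-12))
        snrs.append(np.clip(snr, lo_db, hi_db))
    return float(np.mean(snrs))

def segmental_snr_improvement(clean, noisy, enhanced, **kwargs):
    """Segmental SNR of the enhanced signal minus that of the noisy input."""
    return segmental_snr(clean, enhanced, **kwargs) - segmental_snr(clean, noisy, **kwargs)
```

A positive improvement means the enhanced signal is closer to the clean reference than the noisy input was, averaged over frames.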
This research was supported in part by the National Science Fund of China under Grant Nos. 61072123 and 61201403. It was also supported by the development of service architecture of convergent networks under No. 2011AA01A102 and the tri-networks integration under No. KGZD-EW-103-5(3).
The paper, entitled “A Cepstrum-based Preprocessing and Postprocessing for Speech Enhancement in Adverse Environments”, will be released in Applied Acoustics (Vol. 74, No. 12, December 2013, pp. 1458–1462).