In real acoustic environments, the perceived quality and intelligibility of enhanced speech signals are badly degraded by non-stationary background noise, especially wind noise picked up when recording outdoors.
Wind noise is usually generated by turbulent airflow around the user’s head, the recording device, or other obstacles, and it can seriously degrade sound quality in voice communication.
Recently, researchers Bai Haichuan and Ge Fengpei from the Institute of Acoustics (IOA) of the Chinese Academy of Sciences proposed a new speech enhancement framework for wind noise reduction that aims to remove wind noise without distorting speech in voice communication.
The paper, entitled “DNN-based Speech Enhancement Using Soft Audible Noise Masking for Wind Noise Reduction”, was published online in China Communications.
A deep neural network (DNN) model, trained on large amounts of data recorded in the specific environment, can estimate the wind noise and speech components effectively. However, some residual wind noise remains in the low-frequency region below 3 kHz, where the spectrum of wind noise overlaps with that of voiced speech. Especially at low signal-to-noise ratios, this residual noise is easily perceived by the human ear, and it degrades the auditory quality and intelligibility of the enhanced speech signals.
To suppress the audible residual wind noise, the researchers employed a psychoacoustic model to compute the masking threshold from the estimated speech spectrum, and incorporated the soft audible noise masking principle into the spectral weighting algorithm by using the masking threshold together with the estimated wind noise spectrum. To cope with rapidly time-varying signals, both the speech and the noise spectra were estimated with deep neural networks.
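The idea of weighting the spectrum by a masking threshold can be illustrated with a minimal Python sketch. It is not the method from the paper: the function names, the fixed -10 dB offset used in place of a real psychoacoustic model, the gain rule, and the gain floor are all assumptions chosen for illustration. The sketch takes per-bin power estimates of speech and wind noise (which in the framework would come from the DNNs) and attenuates a bin only as far as needed to push the residual noise down to the masking threshold.

```python
import math


def masking_threshold(speech_psd, offset_db=-10.0):
    """Hypothetical stand-in for a psychoacoustic model.

    Assumes noise is inaudible when it lies offset_db below the
    estimated speech power in the same frequency bin. A real model
    would also account for spreading across critical bands and the
    absolute threshold of hearing.
    """
    scale = 10.0 ** (offset_db / 10.0)
    return [scale * s for s in speech_psd]


def audible_noise_gain(speech_psd, noise_psd, gain_floor=0.05):
    """Per-bin amplitude gain based on a simplified audible-noise rule.

    Bins whose estimated noise is already below the threshold are left
    untouched (gain 1.0). Otherwise the gain g is chosen so that the
    residual noise power g^2 * noise equals the threshold, limited
    from below by gain_floor (an assumed value).
    """
    gains = []
    for thr, noise in zip(masking_threshold(speech_psd), noise_psd):
        if noise <= thr:
            gains.append(1.0)  # noise is masked by speech: no attenuation
        else:
            gains.append(max(gain_floor, math.sqrt(thr / noise)))
    return gains


# Three example bins: masked noise, audible noise, noise with no speech.
example_gains = audible_noise_gain([1.0, 1.0, 0.0], [0.05, 0.4, 1.0])
```

Leaving already masked bins untouched is what limits speech distortion in such a scheme: attenuation is applied only where the residual noise would otherwise be audible.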
This novel DNN-based speech enhancement framework for wind noise reduction was evaluated by objective and subjective means.
In the objective evaluation, the proposed and reference methods were tested with wind noise collected outdoors by a microphone without a wind screen on a windy day, with wind speeds of up to 15 m/s. Speech enhanced by the proposed method obtained higher objective quality scores than speech enhanced by the reference Direct DNN method, and the proposed method restored the speech spectrum more accurately.
In the subjective preference tests, signals enhanced by the proposed method were preferred over those enhanced by the Direct DNN method.
The objective and subjective evaluation results showed that the proposed wind noise reduction framework effectively suppressed the residual wind noise in the low-frequency region and outperformed conventional DNN-based wind noise reduction methods.
Funding for this research came from the National Natural Science Foundation of China (Nos. 11590772 and 11590770).
Reference:
BAI Haichuan, GE Fengpei, YAN Yonghong. DNN-based Speech Enhancement Using Soft Audible Noise Masking for Wind Noise Reduction. China Communications (Volume 15 Issue 9, September 2018, Pages 235-243). DOI: 10.1109/CC.2018.8456465.
Contact:
WANG Rongquan
Institute of Acoustics, Chinese Academy of Sciences, 100190 Beijing, China
E-mail: media@mail.ioa.ac.cn