At the invitation of the CAS Key Laboratory of Speech Acoustics and Content Understanding, Dr. MA Ning from the University of Sheffield, UK, visited IOA on September 30th and gave an academic talk.
The talk was titled “Exploiting top-down source models to improve binaural localization of multiple sources in reverberant environments”.
The talk covered recent progress by Dr. MA Ning and his group on binaural sound localization of multiple sources in reverberant environments. Most current binaural sound source localization approaches are bottom-up: they extract the interaural time and level differences from the binaural signals and use them as cues to estimate the sound source positions. However, many biological studies of human hearing indicate that, in addition to these two low-level features, listeners also localize sound sources in a top-down manner, i.e., they can actively select the sources of interest to attend to.
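As a rough illustration of the bottom-up cues mentioned above, the following minimal sketch estimates a time difference and a level difference from a pair of ear signals. It is not the method presented in the talk: the function name `estimate_itd_ild`, its parameters, and the toy signals are all assumptions made for illustration, and a practical system would normally compute these cues per frequency band after an auditory filterbank.

```python
import math

def estimate_itd_ild(left, right, sample_rate, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length lists of samples.

    Hypothetical helper: the time difference is the lag that maximizes the
    cross-correlation, and the level difference is the energy ratio in dB.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag]; a positive lag means the
        # right-ear signal arrives later than the left-ear signal.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    itd = best_lag / sample_rate

    eps = 1e-12  # avoids division by zero and log of zero on silent input
    energy_left = sum(x * x for x in left) + eps
    energy_right = sum(x * x for x in right) + eps
    ild = 10.0 * math.log10(energy_left / energy_right)
    return itd, ild

# Toy input: the right channel is the left channel delayed by 3 samples
# and attenuated by half, as if the source were on the listener's left.
left = [0.0] * 32
right = [0.0] * 32
left[10], left[11] = 1.0, -0.5
right[13], right[14] = 0.5, -0.25

itd, ild = estimate_itd_ild(left, right, sample_rate=16000, max_lag=8)
```

In this toy case the sign convention gives a positive time difference and a positive level difference when the source is closer to the left ear.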
Inspired by the selective attention mechanism of human hearing, the talk introduced an algorithmic framework that combines bottom-up and top-down cues for binaural sound source localization. The basic idea is to pre-train models of the target, interference, and noise sources from prior knowledge, and then to weight the low-level features according to the trained models so that localization is guided toward the target. Experimental results indicated that adding the top-down cues significantly improved localization accuracy in reverberant, multi-source environments.
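The weighting idea can be sketched in a few lines, under strong simplifying assumptions. Here the bottom-up stage is assumed to have already produced, for each time frame, a likelihood for each candidate azimuth, and the pre-trained source models are assumed to have produced one weight per frame expressing how likely that frame is dominated by the target. The function `localize_with_weights` and all numbers below are hypothetical and are not taken from the talk.

```python
def localize_with_weights(frame_likelihoods, target_weights):
    """Pool per-frame azimuth likelihoods using top-down frame weights.

    frame_likelihoods[t][a]: bottom-up likelihood of azimuth index a at frame t.
    target_weights[t]: weight in [0, 1], assumed to come from source models.
    Returns (best_azimuth_index, normalized_pooled_scores).
    """
    n_azimuths = len(frame_likelihoods[0])
    pooled = [0.0] * n_azimuths
    for likelihoods, weight in zip(frame_likelihoods, target_weights):
        for a, value in enumerate(likelihoods):
            # Frames judged to belong to the target contribute more.
            pooled[a] += weight * value
    total = sum(pooled)
    if total > 0:
        pooled = [p / total for p in pooled]
    best = max(range(n_azimuths), key=lambda a: pooled[a])
    return best, pooled

# Toy data: three candidate azimuths, four frames. An interferer at
# azimuth index 2 dominates three frames; the target at index 0
# dominates only frame 2.
frames = [
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
]
uniform = [1.0, 1.0, 1.0, 1.0]   # no top-down information
guided = [0.1, 0.1, 0.9, 0.1]    # assumed output of the source models

unweighted_best, _ = localize_with_weights(frames, uniform)
guided_best, guided_scores = localize_with_weights(frames, guided)
```

With uniform weights the pooled estimate follows the interferer, while the guided weights shift it to the target direction, which is the effect the framework aims for.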
The talk attracted wide interest from the attendees, who engaged in lively discussion and exchanged experience with Dr. MA Ning.
(Source: NA Yueyue from the CAS Key Laboratory of Speech Acoustics and Content Understanding)