Professor Jorge Trevino from Tohoku University visited the CAS Key Laboratory of Speech Acoustics and Content Understanding at IOA on August 9, at the invitation of Professor LI Junfeng of IOA. During his visit, Jorge gave an academic lecture on his research area, entitled “Spatial sound recording, analysis and reproduction”.
Jorge began his lecture by introducing his research background. There was a strong push towards the ultra-realistic presentation of multimedia content, made possible by the latest advances in computational and signal processing technologies. Three-dimensional sound presentation was necessary to convey a natural and rich multimedia experience. Promising ways to achieve this included the sound field reproduction technique known as high-order Ambisonics. While these advanced methods were within the capabilities of consumer-level processing systems, their adoption was hindered by a lack of content. Production and coding of the audio components in multimedia focused on traditional formats, such as stereophonic sound. Mainstream audio codecs and media such as CDs or DVDs did not support advanced, rich content such as high-order Ambisonics encodings.
Jorge then put forward a novel way to downmix high-order Ambisonics content into a conventional stereo signal, in order to ameliorate this problem and speed up the adoption of spatial sound technologies.
The resulting data could be distributed using conventional methods, such as audio CDs or the audio component of an internet video stream. The results were fully compatible with conventional stereo and could be listened to by users with legacy reproduction systems. In addition, they carried spatial information encoded as inter-channel level and phase differences. The proposed method consisted of a downmixing filterbank which independently modulated the inter-channel differences at each frequency bin. The proposal was evaluated using simple test signals and found to outperform conventional methods such as matrix-encoded surround and the Ambisonics UHJ format in terms of spatial resolution.
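The idea of carrying direction as per-bin inter-channel level and phase differences can be illustrated with a small sketch. This is not the filterbank presented in the lecture, whose details are not given here. The mapping below is a toy assumption chosen only for illustration: the level difference carries the left-right component of a source azimuth, and the phase difference carries the front-back component. The function names `downmix_bins` and `estimate_azimuth` are hypothetical.

```python
import numpy as np

def downmix_bins(spectrum, azimuth):
    """Toy per-bin stereo downmix (illustrative assumption, not the lecture's method).

    spectrum: complex value of a source in each frequency bin.
    azimuth:  direction per bin in radians, positive towards the left.
    """
    s = np.asarray(spectrum, dtype=complex)
    az = np.asarray(azimuth, dtype=float)
    # Level difference: pan angle in [0, pi/2] carries sin(azimuth) (left-right).
    pan = (np.pi / 4) * (1 - np.sin(az))
    # Phase difference in [0, pi/2] carries cos(azimuth) (front-back).
    phi = (np.pi / 4) * (1 - np.cos(az))
    left = np.cos(pan) * s * np.exp(1j * phi / 2)
    right = np.sin(pan) * s * np.exp(-1j * phi / 2)
    return left, right

def estimate_azimuth(left, right):
    """Recover the per-bin azimuth from the inter-channel differences."""
    pan = np.arctan2(np.abs(right), np.abs(left))
    phi = np.angle(left * np.conj(right))
    sin_az = 1 - 4 * pan / np.pi
    cos_az = 1 - 4 * phi / np.pi
    return np.arctan2(sin_az, cos_az)

# Example: eight bins, each with its own direction.
rng = np.random.default_rng(0)
source = rng.normal(size=8) + 1j * rng.normal(size=8)
directions = np.array([-2.5, -1.0, -0.3, 0.0, 0.4, 1.2, 2.0, 3.0])
left, right = downmix_bins(source, directions)
recovered = estimate_azimuth(left, right)
```

In this sketch the two channels together keep the energy of the source in every bin, so the pair can still be played as ordinary stereo, while a decoder that knows the mapping can read the direction back from each bin. Directions panned fully to one side leave one channel silent, so their phase difference cannot be measured; the example avoids those cases.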
Jorge’s lecture was well received by the attendees, and some of the research methods were discussed afterwards.
(Source: The CAS Key Laboratory of Speech Acoustics and Content Understanding)