Research Progress on Chinese-English Bilingual Speech Recognition

In recent years, bilingual communication has become common as a result of globalization, which presents a new challenge for real-world applications of speech recognition technology. The main difficulties in handling bilingual speech recognition in such applications fall into two areas: the first is to balance performance on inter- and intra-sentential language switching while reducing the complexity of the bilingual speech recognition system; the second is to deal effectively with matrix-language accents in embedded-language speech.

To handle intra-sentential language switching and reduce the amount of data required to estimate statistical models robustly, ZHANG Qingqing, PAN Jielin and YAN Yonghong of the ThinkIT Lab, Chinese Academy of Sciences, conducted a series of studies and developed a single compact bilingual acoustic model, derived by phone set merging and clustering, in place of two separate monolingual models, one per language.

In their study, they present a novel Two-pass phone clustering method based on Confusion Matrix (TCM) and compare it with the log-likelihood measure method. To account for the wide variation in accented pronunciation, they also present a novel bilingual model modification approach that improves nonnative speech recognition. Experiments show that, with these methods, the Chinese-English bilingual speech recognition system handles bilingual speech effectively and efficiently.
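The general idea of using a confusion matrix as a similarity measure for phone set merging can be illustrated with a short sketch. It is not the TCM method itself, whose two passes are not described here; it is a simplified single-pass illustration. The function name, the threshold value and the phone labels are all hypothetical, and the confusion counts are made-up toy numbers.

```python
import numpy as np

def merge_phones_by_confusion(confusion, zh_phones, en_phones, threshold=0.5):
    """Map English phones onto Mandarin phones they are often confused with.

    confusion[i, j] is assumed to hold how many times English phone i was
    recognized as Mandarin phone j (toy data, not from the study).
    """
    counts = np.asarray(confusion, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Row-normalize counts into confusion rates; all-zero rows stay zero.
    rates = np.divide(counts, row_sums,
                      out=np.zeros_like(counts), where=row_sums > 0)

    mapping = {}
    for i, en in enumerate(en_phones):
        j = int(np.argmax(rates[i]))
        # Merge only when one Mandarin phone clearly dominates the confusions.
        if rates[i, j] >= threshold:
            mapping[en] = zh_phones[j]

    # Merged phone set: all Mandarin phones plus the unmerged English phones.
    merged = list(zh_phones) + [p for p in en_phones if p not in mapping]
    return mapping, merged


# Hypothetical phone labels and counts, for illustration only.
zh = ["b", "m", "s"]
en = ["b_en", "th_en", "m_en"]
toy_confusion = [[8, 1, 1],
                 [3, 3, 4],
                 [0, 9, 1]]
mapping, merged = merge_phones_by_confusion(toy_confusion, zh, en)
print(mapping)
print(merged)
```

In this toy case the English phone with no dominant Mandarin counterpart keeps its own entry, so the merged set is smaller than the union of the two monolingual sets without forcing dissimilar phones together.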