SHAO Yanqiu HAN Jiqing ZHAO Yongzhen LIU Ting
(School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001)
Received Feb. 21, 2006
Revised Apr. 5, 2006
Abstract: Stress is an important parameter for prosody processing in speech synthesis. In this paper, we compare the acoustic features of neutral-tone syllables and strongly stressed syllables with those of moderately stressed syllables, including pitch, syllable duration, intensity, and the length of the pause after the syllable. The relations between duration and pitch, and between the Third Tone (T3) and pitch, are also studied. Three ANN-based stress prediction models, i.e., the acoustic model, the linguistic model, and the mixed model, are presented for predicting Chinese sentential stress. The results show that the mixed model performs better than the other two models. To address the variability of manual labeling, an evaluation index called the support ratio is proposed.
PACS number: 43.70