Authors: Tan Lee, Mei-Yuh Hwang, Xin Lei, Mari Ostendorf, Man-Hung Siu
DOI:
Keywords:
Abstract: Tone plays a crucial role in Mandarin speech in distinguishing ambiguous words. Most state-of-the-art Mandarin automatic speech recognition systems adopt embedded tone modeling, where tonal acoustic units are used and F0 features are appended to the spectral feature vector. In this paper, we combine the embedded approach (using improved F0 smoothing) with explicit tone modeling by rescoring the output lattices. Oracle experiments indicate that a 32% relative improvement can be achieved with perfect tone information. Recognition experiments on Mandarin broadcast news show that, even with a tone classification accuracy of only 70%, the explicit tone classifier offers complementary knowledge and improves performance significantly. Through the combination of these tone modeling techniques, the character error rate on the CTV test set is reduced from 13.0% to 11.5%.