Chinese speech recognition system and method

Authors: Sin-Horng Chen, Yih-Ru Wang, Chen-Yu Chiang, Yuan-Fu Liao, Ming-Chieh Liu

Keywords: Word (computer architecture); Artificial intelligence; State model; Structure (mathematical logic); Speech recognition; Segmentation; Factored language model; Natural language processing; Computer science; Syllable; SIGNAL (programming language); Speech synthesis

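Abstract: A Chinese speech recognition system and method are disclosed. First, a speech signal is received and recognized to output a word lattice. Next, the word lattice is received, and the word arcs of the word lattice are rescored and reranked with a prosodic break model, a prosodic state model, a syllable prosodic-acoustic model, a syllable-juncture prosodic-acoustic model and a factored language model, so as to output a language tag, a prosodic tag and a phonetic segmentation tag which correspond to the speech signal. The present invention performs rescoring in a two-stage way to promote the recognition rate of basic speech information, and labels the language tag, the prosodic tag and the phonetic segmentation tag to provide the prosodic structure and language information for rear-stage voice conversion and voice synthesis.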
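
The second-stage rescoring and reranking described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `Hypothesis` class, the model key names, the log-linear weighting, and the toy scores are all hypothetical. Each candidate path from the first-stage word lattice is taken to carry a first-stage log score plus one log score per second-stage model, and the candidates are reordered by a weighted sum.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical keys for the five second-stage models named in the abstract.
MODEL_NAMES = (
    "prosodic_break",
    "prosodic_state",
    "syllable_prosodic_acoustic",
    "syllable_juncture_prosodic_acoustic",
    "factored_language",
)


@dataclass
class Hypothesis:
    """One candidate word sequence taken from the first-stage word lattice."""
    words: List[str]
    first_stage_score: float  # log score from the first recognition pass
    model_scores: Dict[str, float] = field(default_factory=dict)


def rescore(hyp: Hypothesis, weights: Dict[str, float],
            first_stage_weight: float = 1.0) -> float:
    """Combine log scores log-linearly (an assumed combination rule)."""
    total = first_stage_weight * hyp.first_stage_score
    for name in MODEL_NAMES:
        # Missing weights or scores contribute nothing to the total.
        total += weights.get(name, 0.0) * hyp.model_scores.get(name, 0.0)
    return total


def rerank(hyps: List[Hypothesis], weights: Dict[str, float]) -> List[Hypothesis]:
    """Return the hypotheses sorted from best to worst combined score."""
    return sorted(hyps, key=lambda h: rescore(h, weights), reverse=True)


# Toy data: the first stage prefers hyp_a, the second-stage models prefer hyp_b.
weights = {name: 1.0 for name in MODEL_NAMES}
hyp_a = Hypothesis(["w1", "w2"], -10.0, {name: -1.0 for name in MODEL_NAMES})
hyp_b = Hypothesis(["w1", "w3"], -11.0, {name: -0.5 for name in MODEL_NAMES})
ranked = rerank([hyp_a, hyp_b], weights)
```

In this toy case the second-stage scores are large enough to overturn the first-stage ordering. A real system would also emit the language, prosodic and phonetic segmentation tags for the winning path, which this sketch omits.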

References (3)
Mari Ostendorf, Rebecca Bates, Izhak Shafran, Prosody Models for Conversational Speech Recognition (2003)
Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan, Improved Speech Recognition using Acoustic and Lexical Correlates of Pitch Accent in a N-Best Rescoring Framework. International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 873-876 (2007), 10.1109/ICASSP.2007.367209
Chen-Yu Chiang, Sin-Horng Chen, Hsiu-Min Yu, Yih-Ru Wang, Unsupervised joint prosody labeling and modeling for Mandarin speech. Journal of the Acoustical Society of America, vol. 125, pp. 1164-1183 (2009), 10.1121/1.3056559