Authors: Ellen M. Eide, Robert E. Donovan
DOI:
Keywords: Constant (mathematics), Speech recognition, Speech synthesis, Duration (music), Electroglottograph, Stress (linguistics), Computer science, Block (data storage), Signal, Natural (music)
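Abstract: A method for automatically generating pitch contours in a text-to-speech (TTS) system, the system converting input text into an output acoustic signal simulating natural speech, comprising the steps of: storing a plurality of associated stress and pitch level pairs, each of the pairs including a lexical stress level and a pitch level; calculating the stress levels of the input text; comparing the stress levels of the input text with the stored stress levels to find the closest stored stress levels; and copying the pitch levels associated with the closest stored stress levels to generate the pitch contours of the input text. Features illustrative of various modes of the invention include pitch levels that correspond to the end of vowels, use of a phonetic dictionary to expand words into phonemes and concatenate stress levels, blocking of sentences into blocks of constant or variable lengths by segmenting from the ends of the sentences toward the beginnings, and averaging of the pitch levels at a block boundary. The method may distinguish among declarations, questions, and exclamations. Training speech may be collected from more than one speaker and scaled; the speaker(s) may wear a laryngograph to provide vocal cord activity.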
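The matching step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The function names, the integer stress coding, the absolute-difference distance measure, and the sample pitch values in Hz are all assumptions made for illustration. The sketch blocks a sentence's stress sequence from the end toward the beginning, finds the stored stress pattern closest to each block, and copies the associated pitch levels. Averaging at block boundaries, sentence-type handling, and speaker scaling are omitted.

```python
from typing import List, Sequence, Tuple

# Hypothetical stored pair: (lexical stress levels, associated pitch levels in Hz).
Template = Tuple[Tuple[int, ...], Tuple[float, ...]]


def split_into_blocks(stress: Sequence[int], block_len: int) -> List[Tuple[int, ...]]:
    """Segment a stress sequence into constant-length blocks, working from
    the end of the sentence toward the beginning (the first block may be short)."""
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    blocks = []
    end = len(stress)
    while end > 0:
        start = max(0, end - block_len)
        blocks.append(tuple(stress[start:end]))
        end = start
    blocks.reverse()  # restore left-to-right sentence order
    return blocks


def stress_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Assumed distance measure: sum of absolute stress-level differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def closest_template(block: Sequence[int], templates: Sequence[Template]) -> Template:
    """Return the stored pair whose stress pattern is closest to the block.
    Only stored patterns of the same length are considered in this sketch."""
    candidates = [t for t in templates if len(t[0]) == len(block)]
    if not candidates:
        raise ValueError(f"no stored pattern of length {len(block)}")
    return min(candidates, key=lambda t: stress_distance(block, t[0]))


def generate_pitch_contour(
    stress: Sequence[int], templates: Sequence[Template], block_len: int
) -> List[float]:
    """Copy the pitch levels of the closest stored pattern for each block."""
    contour: List[float] = []
    for block in split_into_blocks(stress, block_len):
        _, pitch = closest_template(block, templates)
        contour.extend(pitch)
    return contour


# Illustrative training inventory (values are made up).
TEMPLATES: List[Template] = [
    ((1, 0), (120.0, 100.0)),
    ((0, 1), (100.0, 125.0)),
    ((1,), (130.0,)),
]

if __name__ == "__main__":
    print(generate_pitch_contour([1, 0, 1, 1, 0], TEMPLATES, block_len=2))
```

In this sketch each syllable carries one stress level and receives one pitch value. A fuller version would first expand words into phonemes with a phonetic dictionary to obtain the stress levels, as the abstract describes.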