Authors: Antonio Bonafonte, Alexander Kain, Jan P. H. van Santen, Helenca Duxans
DOI:
Keywords:
Abstract: Voice Conversion (VC) systems modify a speaker's voice (the source speaker) so that it is perceived as if another speaker (the target speaker) had uttered it. Previously published VC approaches using Gaussian Mixture Models (GMMs) [1] perform the conversion on a frame-by-frame basis using only spectral information. In this paper, two new approaches are studied in order to extend GMM-based VC systems. First, dynamic information is used to build the acoustic model, so that the transformation is carried out according to sequences of frames. Then, phonetic information is introduced into the training of the system. Objective and perceptual results compare the performance of the proposed systems.