Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction

Authors: A. Kain, M.W. Macon

DOI: 10.1109/ICASSP.2001.941039

Keywords: Speaker diarisation, Linear predictive coding, Algorithm, Speech recognition, Computer science, Spectral envelope, Speech processing, Voice analysis, Speech synthesis, Speech coding, Codec2, Voice activity detection, Speaker recognition, Active listening

Abstract: … In Section 5, we propose a novel way of constructing a converted utterance by mapping parameters of a spectral envelope and then predicting the residual from it. This is in contrast to …

References (10)
Olivier Cappé, Yannis Stylianou, Eric Moulines. Statistical methods for voice quality transformation. Conference of the International Speech Communication Association, pp. 447-450 (1995).
John-Paul Hosom, Ronald A. Cole. Automatic time alignment of phonemes using acoustic-phonetic information. PhD thesis, pp. 2035- (2000).
Ki Seung Lee, Dae Hee Youn, Il Whan Cha. A new voice transformation method based on both linear and nonlinear prediction analysis. International Conference on Spoken Language Processing, vol. 3, pp. 1401-1404 (1996). DOI: 10.1109/ICSLP.1996.607876
Jody Kreiman, George Papcun. Comparing discrimination and recognition of unfamiliar voices. Speech Communication, vol. 10, pp. 265-275 (1991). DOI: 10.1016/0167-6393(91)90016-M
D.G. Childers. Glottal source modeling for voice conversion. Speech Communication, vol. 16, pp. 127-138 (1995). DOI: 10.1016/0167-6393(94)00050-K
Levent M. Arslan. Speaker transformation algorithm using segmental codebooks (STASC). Speech Communication, vol. 28, pp. 211-226 (1999). DOI: 10.1016/S0167-6393(99)00015-1
A. Schmidt-Nielsen, Karen R. Stern. Recognition of previously unfamiliar speakers as a function of narrow-band processing and speaker selection. Journal of the Acoustical Society of America, vol. 79, pp. 1174-1177 (1986). DOI: 10.1121/1.393392
M. Abe, S. Nakamura, K. Shikano, H. Kuwabara. Voice conversion through vector quantization. International Conference on Acoustics, Speech, and Signal Processing, pp. 655-658 (1988). DOI: 10.1109/ICASSP.1988.196671
A. Schmidt-Nielsen, D.P. Brock. Speaker recognizability testing for voice coders. International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1149-1156 (1996). DOI: 10.1109/ICASSP.1996.543568
A. Kain, M.W. Macon. Spectral voice conversion for text-to-speech synthesis. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 285-288 (1998). DOI: 10.1109/ICASSP.1998.674423