Probabilistic Integration of Joint Density Model and Speaker Model for Voice Conversion

Authors: Nobuaki Minematsu, Daisuke Saito, Shinji Watanabe, Atsushi Nakamura

Abstract: This paper describes a novel approach to voice conversion using both a joint density model and a speaker model. In voice conversion studies, approaches based on a Gaussian Mixture Model (GMM) with probabilistic densities of joint vectors of the source and target speakers are widely used to estimate a transformation. However, for sufficient quality, they require a parallel corpus which contains plenty of utterances with the same linguistic content spoken by both speakers. In addition, joint density GMM methods often suffer from over-training effects when the amount of training data is small. To compensate for these problems, we propose to integrate the speaker model of the target with the joint density model using a probabilistic formulation. The proposed method trains the joint density model with a few parallel utterances, and the speaker model with non-parallel data of the target, independently. It eases the burden on the source speaker. Experiments demonstrate the effectiveness of the proposed method, especially when the amount of parallel data is small.

Index Terms: voice conversion, joint density model, speaker model, probabilistic unification
