Authors: Hironori Doi, Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano
DOI: 10.1109/ICASSP.2010.5495676
Keywords:
Abstract: This paper presents a novel method of enhancing esophageal speech using statistical voice conversion. Esophageal speech is one of the alternative speaking methods for laryngectomees. Although it doesn't require any external devices, the generated voices sound unnatural. To improve the intelligibility and naturalness of esophageal speech, we propose a voice conversion method from esophageal speech into normal speech. A spectral parameter and excitation parameters of the target normal speech are separately estimated based on Gaussian mixture models. The experimental results demonstrate that the proposed method yields significant improvements in naturalness. We also apply one-to-many eigenvoice conversion to esophageal speech enhancement for flexibly controlling the enhanced voice quality.