Authors: Nobuaki Minematsu, Keikichi Hirose, Daisuke Saito, Keisuke Yamamoto
DOI:
Keywords:
Abstract: This paper describes a novel approach to flexible control of speaker characteristics using a tensor representation of the speaker space. In voice conversion studies, realizing conversion from/to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) was proposed. In EVC, similarly to speaker recognition approaches, a speaker space is constructed from GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each of the speaker GMMs. In this speaker space, each speaker is represented by a small number of weight parameters of eigen-supervectors. In this paper, we revisit the construction of the speaker space by introducing tensor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns respectively correspond to the Gaussian components and the dimensions of the mean vector, and the speaker space is derived by tensor analysis of the set of these matrices. Our approach can solve an inherent problem of the supervector representation, and it improves the performance of voice conversion. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach. Index Terms: voice conversion, Gaussian mixture model, eigenvoice, tensor analysis, Tucker decomposition
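The contrast between the supervector view and the matrix-per-speaker view can be illustrated with a minimal sketch. It is not the authors' implementation: the tensor sizes, the synthetic data, and the helper names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) are all assumptions made for illustration, and a truncated higher-order SVD is used here only as one simple way to obtain a Tucker decomposition.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor, ranks):
    """Truncated higher-order SVD: returns a core tensor and one factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Map a core tensor back to the original space through the factors."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical sizes: S speakers, M Gaussian components, D mean-vector dimensions.
S, M, D = 20, 8, 6
rng = np.random.default_rng(0)
means = rng.normal(size=(S, M, D))   # synthetic stand-in for per-speaker GMM mean matrices
centered = means - means.mean(axis=0)

# Supervector view: each speaker's M x D mean matrix is flattened to one long vector.
supervectors = centered.reshape(S, M * D)

# Tensor view: keep the (speaker, component, dimension) structure and decompose it.
full_core, full_factors = hosvd(centered, ranks=(S, M, D))
full_error = np.linalg.norm(reconstruct(full_core, full_factors) - centered)

small_core, small_factors = hosvd(centered, ranks=(5, 4, 3))
small_error = np.linalg.norm(reconstruct(small_core, small_factors) - centered)
```

In the tensor view the component axis and the feature-dimension axis each get their own factor matrix, whereas flattening into supervectors merges them into a single axis of length M * D. With full ranks the reconstruction is exact up to floating-point error; reducing the ranks trades accuracy for a more compact representation.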