System and method for training cloned tone and rhythm based on Bottleneck features

Authors: Sima Huapeng, Gong Xuefei

Keywords: Speech recognition, Training system, Scheme (programming language), Field (computer science), Speech synthesis, Computer science, Transfer of learning, Bottleneck, Tone (musical instrument), Feature extraction

Abstract: The invention relates to the technical field of voice synthesis, voice recognition and voice cloning, and provides a cloning scheme based on Bottleneck features (linguistic features of the audio) that combines speech synthesis, speech recognition and transfer learning technologies. It includes a training system and a training method. A TTS service with high naturalness and high similarity is provided using a small number of samples, so that a TTS service with the target user's characteristics can be offered and the problems of large sample size, long production period and high labor cost are solved. The training system comprises a data acquisition module, an acoustic feature extraction module, a rhythm module and a multi-person module. A training method for the system is further provided, comprising the steps of training corpus preparation, feature extraction, fine-tuning of all modules, and speech synthesis.
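
The abstract does not say how the Bottleneck features are computed. As a rough illustration of the general idea only, the sketch below passes acoustic frames through a small feed-forward network and keeps the activations of its narrow middle layer as the per-frame features. Everything here is an assumption made for illustration: the layer sizes, the names `make_layer`, `dense` and `extract_bottleneck`, and above all the random weights, which stand in for a network that would in practice be trained (for example as a speech recognizer) before extraction.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Build one dense layer. Random weights stand in for trained ones (assumption)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Apply a dense layer with tanh activation to one input vector."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def extract_bottleneck(frames, layers, bottleneck_index):
    """Run each frame through the network up to and including the bottleneck layer."""
    features = []
    for frame in frames:
        h = frame
        for layer in layers[: bottleneck_index + 1]:
            h = dense(layer, h)
        features.append(h)
    return features


# Hypothetical dimensions: 40-dim acoustic frames, 16-dim bottleneck.
ACOUSTIC_DIM, HIDDEN_DIM, BOTTLENECK_DIM = 40, 64, 16
rng = random.Random(0)
layers = [
    make_layer(ACOUSTIC_DIM, HIDDEN_DIM, rng),
    make_layer(HIDDEN_DIM, BOTTLENECK_DIM, rng),  # the narrow bottleneck layer
    make_layer(BOTTLENECK_DIM, HIDDEN_DIM, rng),  # later layers are not used at extraction time
]

# Five synthetic frames stand in for real acoustic features.
frames = [[rng.gauss(0.0, 1.0) for _ in range(ACOUSTIC_DIM)] for _ in range(5)]
bottleneck = extract_bottleneck(frames, layers, bottleneck_index=1)
print(len(bottleneck), len(bottleneck[0]))
```

In a real system these lower-dimensional vectors, rather than the raw acoustic frames, would be passed on to the downstream synthesis modules; how the patent wires them into its rhythm and multi-person modules is not stated in this record.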
