Authors: Ron Hoory, Itzhack Goldberg, Boaz Mizrachi, Zvi Kons
DOI:
Keywords: Speech recognition, Expression (mathematics), Computer science, Text to speech synthesis, Speaker diarisation
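Abstract: A method and system are provided for text-to-speech synthesis with personalized voice. The method includes receiving an incidental audio input (403) of speech in the form of an audio communication from an input speaker (401) and generating a voice dataset (404) for the input speaker (401). The method includes receiving a text input (411) at the same device as the audio input (403) and synthesizing (312) the text from the text input (411) to synthesized speech, including using the voice dataset (404) to personalize the synthesized speech to sound like the input speaker (401). In addition, the method includes analyzing (316) the text for expression and adding the expression (315) to the synthesized speech. The audio communication may be part of a video communication (453), and the audio input (403) may have an associated visual input (455) of an image of the input speaker. The synthesis from text may include providing a synthesized image personalized to look like the image of the input speaker, with expressions added from the visual input (455).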