Authors: F. Alias, X. Sevillano, J.C. Socoro, X. Gonzalvo
Keywords: Speech processing, Natural language processing, Word processing, Natural language, Speech recognition, Speech synthesis, Domain (software engineering), Artificial intelligence, Context (language use), Computer science, Text processing, Field (computer science)
Abstract: This paper is a contribution to the recent advancements in the development of high-quality next-generation text-to-speech (TTS) synthesis systems. Two of the hottest research topics in this area are oriented towards the improvement of speech expressiveness and the flexibility of synthesis. In this context, this paper presents a new TTS strategy called multidomain TTS (MD-TTS) for synthesizing among different domains. Although the multidomain philosophy has been widely applied in spoken language systems, few efforts have been conducted to extend it to the TTS field. To do so, several proposals are described in this paper. First, a text classifier (TC) is included in the classic TTS architecture in order to automatically conduct the selection of the most appropriate domain for synthesizing the input text. In contrast to classic topic text classification tasks, the MD-TTS TC should consider not only the contents of the text but also its structure. To this end, this paper introduces a new text modeling scheme based on an associative relational network, which represents texts as a directional weighted word-based graph. The experiments conducted validate the proposal in terms of both objective (TC efficiency) and subjective (perceived synthetic speech quality) evaluation criteria.
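
As a rough illustration of what a directional weighted word-based graph can look like, the Python sketch below builds one from raw text. It is not the associative relational network described in the paper: the function name `build_word_graph`, the whitespace tokenization, and the choice of edge weight (the number of times one word immediately follows another) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_word_graph(text):
    """Build a directed weighted word graph as {source: {target: weight}}.

    Assumption (not taken from the paper): an edge source -> target exists
    when target immediately follows source in the text, and its weight is
    the number of times that ordered pair occurs.
    """
    # Hypothetical tokenization: lowercase and split on whitespace.
    words = text.lower().split()
    graph = defaultdict(lambda: defaultdict(int))
    # Pair each word with its successor; edge direction follows word order.
    for source, target in zip(words, words[1:]):
        graph[source][target] += 1
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {source: dict(targets) for source, targets in graph.items()}


graph = build_word_graph("the cat saw the dog and the cat ran")
for source, targets in graph.items():
    print(source, "->", targets)
```

Because edges are directed, such a graph keeps word-order information that a bag-of-words representation discards, which is one simple way to reflect the structure of a text as well as its contents.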