Multilingual Speech-to-Speech Translation System: VoiceTra

Authors: Shigeki Matsuda, Xinhui Hu, Yoshinori Shiga, Hideki Kashioka, Chiori Hori

DOI: 10.1109/MDM.2013.99

Keywords: Computer science, Speech technology, Language translation, Artificial intelligence, Speech recognition, Speech synthesis, Natural language processing, Chinese speech synthesis, Language model, VoxForge, Speech corpus, Speech analytics

Abstract: This study presents an overview of VoiceTra, which was developed by NICT and released as the world's first network-based multilingual speech-to-speech translation system for smartphones, and describes in detail its speech recognition, translation, and synthesis with regard to field experiments. We show the effects of updates that use data collected from the experiments to improve our acoustic and language models.


