The ATR Multilingual Speech-to-Speech Translation System

Authors: S. Nakamura, K. Markov, H. Nakaiwa, G. Kikui, H. Kawai

DOI: 10.1109/TSA.2005.860774

Abstract: In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese). There are three main modules in our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. All of them are designed using state-of-the-art technologies developed at ATR. A corpus-based statistical learning framework forms the basis of the system design. We use a parallel database consisting of over 600 000 sentences that cover a broad range of travel-related conversations. A recent evaluation of the overall system showed that its quality is high, being at the level of a person having a Test of English for International Communication (TOEIC) score of 750 out of a perfect 990.
