Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus

Authors: Luisa Bentivogli*, Beatrice Savoldi*, Matteo Negri, Mattia Antonino Di Gangi, Roldano Cattoni

DOI: 10.18653/V1/2020.ACL-MAIN.619

Keywords: Benchmark (computing), Natural language processing, Audio signal, Computer science, Sentence, Gender identity, Artificial intelligence, Speech translation, Grammatical gender, Machine translation, Natural language

Abstract: Translating from languages without productive grammatical gender, like English, into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the referred human entities. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with: i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French).
