CTC alignments improve autoregressive translation

Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig
arXiv preprint arXiv:2210.05200

3
2022
Joint modeling of code-switched and monolingual ASR via conditional factorization

Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 6412-6416

6
2022
ESPnet-SLU: Advancing spoken language understanding through ESPnet

Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 7167-7171

22
2022
ESPnet-SE++: Speech enhancement for robust speech recognition, translation, and understanding

Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang
arXiv preprint arXiv:2207.09514

6
2022
Two-pass low latency end-to-end spoken language understanding

Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan
arXiv preprint arXiv:2207.06670

3
2022
Differentiable allophone graphs for language-universal speech recognition

Brian Yan, Siddharth Dalmia, David R Mortensen, Florian Metze
arXiv preprint arXiv:2107.11628

7
2021
CMU’s IWSLT 2022 dialect speech translation system

Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022) 298-307

7
2022
Reproducing Whisper-style training using an open-source toolkit and publicly available data

Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 1-8

12
2023
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study

Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 11481-11485

7
2024
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization

Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 11971-11975

2024
Speech collage: code-switched audio generation by collaging monolingual corpora

Amir Hussein, Dorsa Zeinali, Ondřej Klejch, Matthew Wiesner
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 12006-12010

2024
Bayes risk CTC: Controllable CTC alignment in sequence-to-sequence tasks

Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng
arXiv preprint arXiv:2210.07499

7
2022
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction

Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan
arXiv preprint arXiv:2308.10107

2023
A comparative study on E-branchformer vs conformer in speech recognition, translation, and understanding tasks

Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan
arXiv preprint arXiv:2305.11073

9
2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

5
2023
Searchable hidden intermediates for end-to-end models of decomposable sequence tasks

Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze
arXiv preprint arXiv:2105.00573

30
2021
Improving massively multilingual ASR with auxiliary CTC objectives

William Chen, Brian Yan, Jiatong Shi, Yifan Peng
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 1-5

23
2023
BERT meets CTC: New formulation of end-to-end speech recognition with pre-trained masked language model

Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa
arXiv preprint arXiv:2210.16663

21
2022
Prompting the hidden talent of web-scale speech models for zero-shot task generalization

Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
arXiv preprint arXiv:2305.11095

20
2023