Jiatong Shi (史嘉彤)

Affiliation: Carnegie Mellon University

Homepage: shijt.site

Citations

Citations: 1,921

h-index: 18

i10-index: 26

Publications: 69

Title

Citations

Year

SUPERB: Speech processing Universal PERformance Benchmark

Shang-Wen Li, Shinji Watanabe, Hung-yi Lee, Xuankai Chang
arXiv: Computation and Language

279

2021

Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss

Nan Huo, Qin Jin, Jiatong Shi, Shuai Guo
International Conference on Acoustics, Speech and Signal Processing

2021

Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation

Dong Yu, Chao Weng, Meng Yu, Shinji Watanabe
International Conference on Acoustics, Speech and Signal Processing

2021

Recent Developments on ESPnet Toolkit Boosted by Conformer

Daniel Garcia-Romero, Shinji Watanabe, Tomoki Hayashi, Hirofumi Inaguma
International Conference on Acoustics, Speech and Signal Processing

159

2021

Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning

Wenxin Hou, Yue Dong, Bairong Zhuang, Longfei Yang
Conference of the International Speech Communication Association, 1037-1041

2020

Leveraging deep learning with audio analytics to predict the success of crowdfunding projects

Jiatong Shi, Kunlin Yang, Wei Xu, Mingming Wang
The Journal of Supercomputing, 1-21

2021

ESPnet2-TTS: Extending the edge of TTS research

Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu
arXiv preprint arXiv:2110.07840

2021

SUPERB-SG: Enhanced speech processing universal performance benchmark for semantic and generative capabilities

Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang
arXiv preprint arXiv:2203.06849

2022

Muskits: an end-to-end music processing toolkit for singing voice synthesis

Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo
arXiv preprint arXiv:2205.04029

2022

SingAug: Data augmentation for singing voice synthesis with cycle-consistent training strategy

Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe
arXiv preprint arXiv:2203.17001

2022

Cross-Lingual Transfer for Speech Processing Using Acoustic Language Similarity

Peter Wu, Jiatong Shi, Yifan Zhong, Shinji Watanabe
SMPTE Journal, 1050-1057

2021

CMU’s IWSLT 2022 dialect speech translation system

Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi
SMPTE Journal, 298-307

2022

Reproducing Whisper-style training using an open-source toolkit and publicly available data

Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8

2023

Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study

Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 11481-11485

2024

Dynamic-SUPERB: Towards a dynamic, collaborative, and comprehensive instruction-tuning benchmark for speech

Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 12136-12140

2024

EURO: ESPnet unsupervised ASR open-source toolkit

Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola Garcia
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1-5

2023

SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan

You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto
arXiv preprint arXiv:2405.05244

2024

ProsodyBERT: Self-supervised prosody representation for style-controllable TTS

Yushi Hu, Chunlei Zhang, Jiatong Shi, Jiachen Lian

2022

Towards end-to-end speaker diarization with generalized neural speaker clustering

Chunlei Zhang, Jiatong Shi, Chao Weng, Meng Yu
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8372-8376

2022

An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer

Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe
Computer Speech & Language 73, 101327

2022