Authors: Yun Lei, Mitchell McLaren, Luciana Ferrer, Nicolas Scheffer
DOI: 10.1109/ICASSP.2014.6854360
Keywords: I vector, Noise (video), Normalization (statistics), Computer science, Speaker recognition, Pattern recognition, Contrast (statistics), Speech recognition, Scale (ratio), NIST, Scheme (programming language), Artificial intelligence
Abstract: A vector Taylor series (VTS) based i-vector extractor was recently proposed for noise-robust speaker recognition by extracting synthesized clean i-vectors to be used in the standard system back-end. This approach brings significant improvements in accuracy for noisy speech conditions. However, it incurred such a large computational expense that using a state-of-the-art model size or running large-scale evaluations was impractical. In this work, we propose an efficient simplification scheme, named sVTS, in order to show that VTS gives improvements in large-scale applications compared with state-of-the-art systems. In contrast to VTS, sVTS generates normalized Baum-Welch statistics and uses the standard i-vector model, making it straightforward to employ on a state-of-the-art i-vector system. Results presented on both the PRISM and NIST SRE'12 corpora show that sVTS provides significant improvements in noisy conditions, and that our simplifications result in only slight degradation with respect to the original VTS approach.