Authors: Ahmed Krobba, Mohamed Debyeche, Sid-Ahmed Selouani
DOI: 10.1007/S11042-020-08748-2
Keywords: Speech recognition, Word error rate, Cepstrum, Gaussian, Linear prediction, Additive white Gaussian noise, Noise, Microphone, Mel-frequency cepstrum, Rayleigh fading, Computer science
Abstract: In this paper, we present a Mixture Linear Prediction based approach for robust Gammatone Cepstral Coefficients extraction (MLPGCCs). The proposed method improves the performance of Automatic Speaker Verification (ASV) using i-vector and Gaussian Probabilistic Linear Discriminant Analysis (GPLDA) modeling under transmission channel noise. The extracted MLPGCCs were evaluated on the NIST 2008 database, where a single microphone recorded conversational speech. The system is analyzed in the presence of different noises, such as Additive White Gaussian Noise (AWGN) and Rayleigh fading, at various Signal-to-Noise Ratio (SNR) levels. The evaluation results show that the MLPGCCs features are a promising approach for the ASV task. Indeed, speaker verification performance is significantly improved compared with conventional Gammatone Frequency Cepstral Coefficients (GFCCs) and Mel Frequency Cepstral Coefficients (MFCCs) features. For speech signals corrupted with AWGN noise at SNRs ranging from -5 dB to 15 dB, we obtain a significant reduction of Equal Error Rate (EER), ranging from 9.41% to 6.65% and from 3.72% to 1.50% compared with MFCCs and GFCCs, respectively. In addition, when the test speech signals are corrupted with Rayleigh fading, we achieve an EER reduction ranging from 23.63% to 7.8% and from 10.88% to 6.8% compared with MFCCs and GFCCs, respectively. We also found that the combination of features gives the highest system performance. The best performance achieved is an EER of around 0.43% to 0.59% and 1.92% to 3.88%.
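The two evaluation ingredients named in the abstract, corrupting speech with AWGN at a chosen SNR and scoring a verification system by its EER, can be illustrated with a minimal sketch. This is not the authors' code: the function names `add_awgn` and `equal_error_rate` are hypothetical, the EER is approximated by a simple threshold sweep, and NumPy is assumed to be available.

```python
import numpy as np


def add_awgn(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to a target SNR (dB).

    Hypothetical helper for illustration; the paper's noise pipeline may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # False rejection rate: target trials scoring below the threshold.
    frr = np.array([np.mean(target_scores < t) for t in thresholds])
    # False acceptance rate: non-target trials scoring at or above the threshold.
    far = np.array([np.mean(nontarget_scores >= t) for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```

In an experiment of the kind described, one would call `add_awgn` on each test utterance for every SNR level in the -5 dB to 15 dB range, extract features, score the trials, and pass the target and non-target scores to `equal_error_rate`. Rayleigh fading would need a separate channel model that is not sketched here.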