Mel-frequency cepstral coefficient-based bandwidth extension of narrowband speech.

Authors: Amr H. Nour-Eldin, Peter Kabal

Abstract: We present a novel MFCC-based scheme for the Bandwidth Extension (BWE) of narrowband speech. BWE is based on the assumption that narrowband speech (0.3–3.4 kHz) correlates closely with the highband signal (3.4–7 kHz), enabling estimation of the highband frequency content given the narrow band. While BWE schemes have traditionally used LP-based parametrizations, our recent work has shown that MFCC parametrization results in higher correlation between both bands, reaching twice that obtained using LSFs. By employing a high-resolution IDCT of highband MFCCs obtained from the narrowband by statistical estimation, we achieve high-quality highband power spectra from which time-domain speech can be reconstructed. Implementing this scheme for BWE translates the advantage of MFCCs into performance superior to that of LSFs, as shown by improvements in log-spectral distortion as well as Itakura-based measures (the latter improving by up to 13%).

Index Terms: Bandwidth extension, high-resolution IDCT, highband certainty, mutual information, source-filter model
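
The high-resolution IDCT step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes MFCCs are an orthonormal DCT-II of log mel-filterbank energies, and the function names `dct2_ortho` and `high_res_idct` are hypothetical. The idea shown is that the inverse cosine series can be evaluated on a finer grid than the original number of filterbank channels, yielding a smooth interpolated log-energy envelope.

```python
import numpy as np

def dct2_ortho(x):
    """Orthonormal DCT-II of a 1-D array (stand-in for the MFCC transform)."""
    n = len(x)
    k = np.arange(n)[:, None]          # cepstral index
    i = np.arange(n)[None, :]          # filterbank channel index
    basis = np.cos(np.pi * (i + 0.5) * k / n)
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)

def high_res_idct(c, num_points):
    """Evaluate the inverse cosine series of coefficients c on num_points
    uniformly spaced positions (assumed interpolation scheme).
    With num_points == len(c) this is the exact inverse of dct2_ortho."""
    n = len(c)
    k = np.arange(n)[None, :]
    m = np.arange(num_points)[:, None]  # fine-grid index
    basis = np.cos(np.pi * (m + 0.5) * k / num_points)
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return basis @ (scale * c)

# Illustrative (made-up) log mel-filterbank energies for 8 channels.
log_energies = np.log(np.array([1.0, 2.0, 4.0, 3.0, 2.5, 1.5, 1.0, 0.5]))
mfcc = dct2_ortho(log_energies)
coarse = high_res_idct(mfcc, len(mfcc))   # same resolution: exact inverse
fine = high_res_idct(mfcc, 64)            # 64-point interpolated envelope
power_envelope = np.exp(fine)             # back from log domain to power
```

In this sketch the fine-grid output lives on the mel-warped axis; mapping it back to linear frequency and reconstructing a time-domain signal are separate steps not shown here.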
