A spectral/temporal method for robust fundamental frequency tracking

Abstract: In this paper, a fundamental frequency (F0) tracking algorithm is presented that is extremely robust for both high quality and telephone speech, at signal-to-noise ratios ranging from clean speech to very noisy speech. The algorithm is named "YAAPT," for "yet another algorithm for pitch tracking." It is based on a combination of time domain processing, using the normalized cross correlation, and frequency domain processing. Major steps include processing of the original acoustic signal and a nonlinearly processed version of the signal, the use of a new method for computing a modified autocorrelation function that incorporates information from multiple spectral harmonic peaks, peak picking to select multiple F0 candidates and associated figures of merit, and extensive use of dynamic programming to find the "best" track among the multiple F0 candidates. The algorithm was evaluated by using three databases and compared with other published F0 tracking algorithms under various noise conditions. For clean speech, the error rates obtained are comparable to those obtained with the best results reported for any other algorithm; for noisy telephone speech, the error rates obtained are lower than those obtained with other methods.
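
The time-domain part of the pipeline described in the abstract (normalized cross correlation, peak picking with figures of merit, dynamic programming over candidates) can be illustrated with a minimal Python sketch. This is not the YAAPT implementation: it omits the nonlinearly processed signal, the spectral harmonic information, and any voicing decision, and the function names (`nccf`, `pick_candidates`, `best_track`), the cost definition, and the `jump_weight` parameter are assumptions made only for illustration.

```python
import numpy as np

def nccf(frame, min_lag, max_lag):
    """Normalized cross correlation of a frame with lagged copies of itself."""
    n = len(frame) - max_lag              # same window length for every lag
    ref = frame[:n]
    ref_energy = np.dot(ref, ref)
    values = np.zeros(max_lag + 1)
    for lag in range(min_lag, max_lag + 1):
        shifted = frame[lag:lag + n]
        denom = np.sqrt(ref_energy * np.dot(shifted, shifted))
        values[lag] = np.dot(ref, shifted) / denom if denom > 0 else 0.0
    return values

def pick_candidates(values, fs, min_lag, max_lag, max_candidates=3):
    """Return (f0_hz, merit) pairs for the strongest local maxima of the NCCF."""
    peaks = [(values[k], k) for k in range(min_lag + 1, max_lag)
             if values[k] > values[k - 1] and values[k] >= values[k + 1]]
    peaks.sort(reverse=True)              # highest merit first
    return [(fs / k, merit) for merit, k in peaks[:max_candidates]]

def best_track(candidates_per_frame, jump_weight=1.0):
    """Dynamic programming over candidates (assumed cost, not the paper's):
    local cost is 1 - merit, transition cost is the relative F0 jump.
    Every frame must contain at least one candidate."""
    costs = [np.array([1.0 - m for _, m in candidates_per_frame[0]])]
    back = []
    for prev, cur in zip(candidates_per_frame, candidates_per_frame[1:]):
        prev_f = np.array([f for f, _ in prev])
        cur_f = np.array([f for f, _ in cur])
        local = np.array([1.0 - m for _, m in cur])
        # trans[i, j]: cost of moving from previous candidate j to current i
        trans = jump_weight * np.abs(cur_f[:, None] - prev_f[None, :]) / prev_f[None, :]
        total = trans + costs[-1][None, :]
        back.append(np.argmin(total, axis=1))
        costs.append(local + np.min(total, axis=1))
    # Trace the cheapest path backwards from the last frame.
    idx = int(np.argmin(costs[-1]))
    path = [idx]
    for pointers in reversed(back):
        idx = int(pointers[idx])
        path.append(idx)
    path.reverse()
    return [candidates_per_frame[t][i][0] for t, i in enumerate(path)]
```

A possible usage, assuming an 8 kHz signal and a 150 to 350 Hz search range: split the signal into 200-sample frames, call `pick_candidates(nccf(frame, 22, 53), 8000, 22, 53)` on each frame, and pass the resulting list to `best_track`. Restricting the lag range in this way is a simplification that sidesteps the octave errors a full tracker has to resolve.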
