Voice pathology detection using auto-correlation of different filters bank

Authors: Ahmed Al-nasheri, Zulfiqar Ali, Ghulam Muhammad, Mansour Alsulaiman

DOI: 10.1109/AICCSA.2014.7073178

Keywords: Singular value decomposition, Radio spectrum, Artificial intelligence, Classifier (UML), Mixture model, Support vector machine, Autocorrelation, Pattern recognition, Band-pass filter, Computer science, Speech recognition, Feature extraction

Abstract: This paper investigates the contribution of frequency bands to automatic voice pathology detection. First, the input signal is passed through a number of time-domain band-pass filters whose center frequencies are spaced on an octave scale. Each filter output is then divided into overlapping frames. The auto-correlation function is applied to each frame to find the first largest peak, in areas other than near the DC value, and its corresponding lag. Each frame is therefore represented by only these two features (peak value and lag). As classifiers, we use Gaussian mixture models (GMM) and a support vector machine (SVM), separately. Two well-known available databases, one in English (MEEI) and one in German (SVD), are used in the investigation. The results demonstrate that the most significant frequency range for detecting voice pathology lies between 1500 Hz and 3500 Hz. Using this band with the two features, the accuracy is above 97% in the case of the MEEI database.
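The feature-extraction steps described in the abstract (octave-spaced band-pass filtering, overlapping frames, autocorrelation peak and lag) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not give the filter design, the frame length, the overlap, the size of the region excluded near zero lag, or whether the peak is normalized, so everything below is an assumption: the biquad band-pass (coefficients in the widely used Audio EQ Cookbook form, with Q = sqrt(2) for roughly a one-octave bandwidth), the normalization by the zero-lag energy, the choice of the largest value beyond `min_lag` as one reading of "first largest peak", the function names, and all parameter values. The GMM and SVM classification stage is left out.

```python
import math


def bandpass_biquad(signal, fs, center_hz, q=math.sqrt(2)):
    """Time-domain band-pass biquad (assumed design, roughly one octave wide)."""
    w0 = 2.0 * math.pi * center_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1, y2, y1 = x1, x, y1, y
        out.append(y)
    return out


def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames (hop < frame_len gives overlap)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def autocorr_peak(frame, min_lag):
    """Return (peak value, lag) of the autocorrelation beyond min_lag.

    min_lag is a hypothetical parameter that skips the region near zero lag.
    The peak is divided by the zero-lag energy (an assumption, not from the paper).
    """
    n = len(frame)
    r0 = sum(v * v for v in frame)
    if r0 == 0.0:
        return 0.0, 0
    best_val, best_lag = -math.inf, 0
    for lag in range(min_lag, n):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if r > best_val:                      # strict '>' keeps the earliest maximum
            best_val, best_lag = r, lag
    return best_val / r0, best_lag


def extract_features(signal, fs, centers_hz, frame_len, hop, min_lag):
    """Map each band center to a list of (peak value, lag) pairs, one per frame."""
    feats = {}
    for fc in centers_hz:
        band = bandpass_biquad(signal, fs, fc)
        feats[fc] = [autocorr_peak(f, min_lag)
                     for f in frames(band, frame_len, hop)]
    return feats


# Illustrative run on a synthetic 200 Hz tone; all values here are made up.
fs = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(960)]
features = extract_features(tone, fs, centers_hz=[250, 500, 1000, 2000],
                            frame_len=240, hop=120, min_lag=20)
```

Each frame of each band then contributes a two-element feature vector, which is the kind of input a GMM or SVM classifier would be trained on. A real voice recording would replace the synthetic tone, and the band centers would be chosen to cover the range of interest, such as the 1500 Hz to 3500 Hz region highlighted in the abstract.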
