Evaluation of Features Extraction Algorithms for a Real-Time Isolated Word Recognition System

Authors: Dalius Navakauskas, Gintautas Tamulevičius, Artūras Serackis, Tomyslav Sledevič

DOI:

Keywords: Comparative evaluation, Field-programmable gate array, Software design, Word recognition, Robustness (computer science), Mel-frequency cepstrum, Algorithm, Dynamic time warping, Speech recognition, Computer science, Artificial intelligence, Pattern recognition, Cepstrum

Abstract: The paper presents a comparative evaluation of feature extraction algorithms for a real-time isolated word recognition system based on an FPGA. The Mel-frequency cepstral, linear frequency [...], linear predictive and their cepstral coefficients were implemented in a hardware/software design. The proposed system was investigated in speaker-dependent mode on 100 different Lithuanian words. The robustness of the algorithms was tested by recognizing speech records at different signal-to-noise rates. Experiments on clean records show the highest accuracy for [...] coefficients. For records with a 15 dB signal-to-noise rate, [...] gives the best result. The hard and soft parts of the system are clocked at 50 MHz and [...] accordingly. For classification purposes, a pipelined dynamic time warping core was implemented. The proposed system satisfies the real-time requirements and is suitable for applications in embedded systems.
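The abstract names dynamic time warping (DTW) as the classifier. As a rough software illustration only, the following sketch computes a textbook DTW distance between two sequences of feature vectors and picks the closest stored template. It is not the pipelined FPGA core from the paper; the function names `dtw_distance` and `recognize`, the Euclidean local distance, and the unconstrained step pattern are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two sequences of equal-dimension feature vectors.

    Assumption: Euclidean local distance and the basic (diagonal, up, left)
    step pattern; the paper's hardware core may use different choices.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b frame repeats
                                 cost[i][j - 1],      # b advances, a frame repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the template with the smallest DTW distance.

    `templates` is a hypothetical dict mapping a word label to its
    reference feature sequence (speaker-dependent mode).
    """
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))
```

In a speaker-dependent setup such as the one described, `templates` would hold one or more reference feature sequences per word, and `recognize` would be called on the features extracted from each new utterance.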

References (19)
G. Čeidaitė, L. Telksnys, Analysis of Factors Influencing Accuracy of Speech Recognition. Elektronika Ir Elektrotechnika, vol. 105, pp. 69-72 (2010), 10.5755/J01.EEE.105.9.9180
Mariusz Rawski, Michal Staworko, FPGA implementation of feature extraction algorithm for speaker verification. International Conference Mixed Design of Integrated Circuits and Systems, pp. 557-561 (2010)
Rytis Maskeliunas, A. Esposito, Multilingual Italian-Lithuanian Small Vocabulary Speech Recognition via Selection of Phonetic Transcriptions. Elektronika Ir Elektrotechnika, vol. 121, pp. 85-88 (2012), 10.5755/J01.EEE.121.5.1145
C. Y. Fook, M. Hariharan, Sazali Yaacob, Adom Ah, Malay speech recognition in normal and noise condition. International Colloquium on Signal Processing and its Applications, pp. 409-412 (2012), 10.1109/CSPA.2012.6194759
Ooi Chia Ai, M. Hariharan, Sazali Yaacob, Lim Sin Chee, Classification of speech dysfluencies with MFCC and LPCC features. Expert Systems With Applications, vol. 39, pp. 2157-2165 (2012), 10.1016/J.ESWA.2011.07.065
Mohamed Atri, Fatma Sayadi, Wajdi Elhamzi, Rached Tourki, Efficient Hardware/Software Implementation of LPC Algorithm in Speech Coding Applications. Journal of Signal and Information Processing, vol. 3, pp. 122-129 (2012), 10.4236/JSIP.2012.31016
Xinhui Zhou, Daniel Garcia-Romero, Ramani Duraiswami, Carol Espy-Wilson, Shihab Shamma, Linear versus mel frequency cepstral coefficients for speaker recognition. IEEE Automatic Speech Recognition and Understanding Workshop, pp. 559-564 (2011), 10.1109/ASRU.2011.6163888
Jun Xu, Aladdin Ariyaeeinia, Reza Sotudeh, Migrate Levinson-Durbin based Linear Predictive Coding algorithm into FPGAs. International Conference on Electronics, Circuits, and Systems, pp. 1-4 (2005), 10.1109/ICECS.2005.4633388
Tomyslav Sledevic, Dalius Navakauskas, FPGA based fast Lithuanian isolated word recognition system. Eurocon 2013, pp. 1630-1636 (2013), 10.1109/EUROCON.2013.6625195
Shing-Tai Pan, Xu-Yu Li, An FPGA-Based Embedded Robust Speech Recognition System Designed by Combining Empirical Mode Decomposition and a Genetic Algorithm. IEEE Transactions on Instrumentation and Measurement, vol. 61, pp. 2560-2572 (2012), 10.1109/TIM.2012.2190344