Tree distributions approximation model for robust discrete speech recognition

Authors: Nacereddine Hammami, Mouldi Bedda, Nadir Farah

DOI: 10.1007/S10772-012-9141-9

Keywords: Word (computer architecture), Tree (data structure), Computer science, Speech recognition, Pattern recognition, Spanning tree, Hidden Markov model, Artificial intelligence, Tree structure, SIGNAL (programming language), Graphical model, Structure (mathematical logic)

Abstract: This paper proposes a new discrete speech recognition method which investigates the capability of graphical models based on tree distributions, which are widely used in many optimization areas. A novel spanning tree structure that utilizes the temporal nature of the speech signal is proposed. The proposed structure significantly reduces complexity, since it reflects only a few essential relationships rather than all possible tree structures. The application of this model is illustrated with different isolated word databases. Experimentally, it has been shown that the proposed approaches, compared to the conventional discrete hidden Markov model (DHMM), yield error-rate reductions of 2.54 %–12 % and improve speed at least 3-fold. In addition, an impressive gain in learning time is observed. The overall accuracy was 93.09 %–95.34 %, thereby confirming the effectiveness of the proposed methods.
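As background for the tree distributions mentioned in the abstract, the following is a minimal sketch of the generic Chow and Liu procedure: estimate the pairwise mutual information between discrete variables, then keep a maximum-weight spanning tree. It is not the spanning tree structure proposed in the paper, whose details are not given in this record. The function names `mutual_information` and `chow_liu_tree` and the toy data are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, i, j):
    """Empirical mutual information (in nats) between discrete variables i and j."""
    n = len(samples)
    count_i = Counter(s[i] for s in samples)
    count_j = Counter(s[j] for s in samples)
    count_ij = Counter((s[i], s[j]) for s in samples)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree over pairwise MI (Kruskal's algorithm)."""
    d = len(samples[0])
    weighted = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,  # strongest dependencies first
    )
    parent = list(range(d))  # union-find forest used for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges


# Toy data (hypothetical): variable 1 copies variable 0, variable 2 is independent.
toy = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
print(chow_liu_tree(toy))
```

In this sketch every pair of variables is scored, which costs a quadratic number of mutual-information estimates. Restricting the candidate edges in advance, as the abstract describes for temporally ordered speech features, would shrink that search.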

References (21)
Marina Meila, An Accelerated Chow and Liu Algorithm: Fitting Tree Distributions to High-Dimensional Sparse Data. International Conference on Machine Learning, pp. 249-257 (1999)
Vincent Y. F. Tan, Animashree Anandkumar, Alan S. Willsky, Learning Gaussian Tree Models: Analysis of Error Exponents and Extremal Structures. IEEE Transactions on Signal Processing, vol. 58, pp. 2701-2714 (2010), 10.1109/TSP.2010.2042478
N. Hammami, M. Bedda, N. Farah, HMM parameters estimation based on cross-validation for Spoken Arabic Digits recognition. International Conference on Communications, Computing and Control Applications, pp. 1-4 (2011), 10.1109/CCCA.2011.6031396
Sanaa El Fkihi, Mohamed Daoudi, Driss Aboutajdine, The mixture of K-Optimal-Spanning-Trees based probability approximation: Application to skin detection. Image and Vision Computing, vol. 26, pp. 1574-1590 (2008), 10.1016/J.IMAVIS.2008.02.003
Antonio Miguel, Alfonso Ortega, Luis Buera, Eduardo Lleida, Bayesian Networks for Discrete Observation Distributions in Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, pp. 1476-1489 (2011), 10.1109/TASL.2010.2092764
N. Hammami, M. Sellam, Tree distribution classifier for automatic spoken Arabic digit recognition. International Conference for Internet Technology and Secured Transactions, pp. 1-4 (2009), 10.1109/ICITST.2009.5402575
Songfang Huang, S. Renals, Hierarchical Bayesian Language Models for Conversational Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, pp. 1941-1954 (2010), 10.1109/TASL.2010.2040782
J.A. Bilmes, C. Bartels, Graphical model architectures for speech recognition. IEEE Signal Processing Magazine, vol. 22, pp. 89-100 (2005), 10.1109/MSP.2005.1511827
A. Wiesel, Y.C. Eldar, A.O. Hero, Covariance Estimation in Decomposable Gaussian Graphical Models. IEEE Transactions on Signal Processing, vol. 58, pp. 1482-1492 (2010), 10.1109/TSP.2009.2037350