Real-time frequency-based noise-robust Automatic Speech Recognition using Multi-Nets Artificial Neural Networks: A multi-views multi-learners approach

Authors: Seyed Reza Shahamiri, Siti Salwah Binti Salim

DOI: 10.1016/J.NEUCOM.2013.09.040

Abstract: Automatic Speech Recognition (ASR) is a technology for identifying uttered word(s) represented as an acoustic signal. However, one of the important aspects of a noise-robust ASR system is its ability to recognise speech accurately in noisy conditions. This paper studies the applications of Multi-Nets Artificial Neural Networks (M-N ANNs), a realisation of the multiple-views multiple-learners approach, as Multi-Networks Speech Recognisers (M-NSRs) in providing a real-time, frequency-based noise-robust ASR model. M-NSRs define the speech features associated with each word as a different view and apply a standalone ANN as one of the learners to approximate that view; meanwhile, multiple-views single-learner (MVSL) ANN-based speech recognisers employ only one ANN to memorise the entire vocabulary. In this research, an M-NSR was provided and evaluated using unforeseen test data that were affected by white, brown, and pink noises; more specifically, 27 experiments were conducted to measure the accuracy and recognition rate of the proposed model. Furthermore, the results were compared in detail with an MVSL ANN-based ASR system. The M-NSR recorded an average recognition rate improved by up to 20.14% when it was given test data infected with noise in our experiments. It is shown that the M-NSR, with its higher degree of generalisability, can handle frequency-based noise because it has a higher recognition rate than the previous model under noisy conditions.
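The multiple-learners idea in the abstract (one standalone learner per word, rather than one network for the whole vocabulary) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: a single logistic unit stands in for each ANN, the three-dimensional feature vectors are toy values rather than real frequency-based speech features, and the names `WordLearner`, `build_recogniser`, and `recognise` are hypothetical.

```python
import math


class WordLearner:
    """One standalone learner per vocabulary word.

    Assumption: a single logistic unit stands in for a full ANN.
    """

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def score(self, x):
        # Confidence in [0, 1] that x is an utterance of this learner's word.
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, samples, epochs=500, lr=0.5):
        # Plain stochastic gradient descent on the logistic loss.
        for _ in range(epochs):
            for x, target in samples:
                err = self.score(x) - target
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * err


def build_recogniser(labelled):
    """Train one learner per word: target 1.0 for its own word, 0.0 otherwise."""
    n_features = len(labelled[0][0])
    learners = {}
    for word in {w for _, w in labelled}:
        learner = WordLearner(n_features)
        learner.train([(x, 1.0 if w == word else 0.0) for x, w in labelled])
        learners[word] = learner
    return learners


def recognise(learners, x):
    """Return the word whose learner gives the highest score for x."""
    return max(learners, key=lambda word: learners[word].score(x))


# Toy feature vectors (hypothetical, not real speech features).
TOY_DATA = [
    ([1.0, 0.0, 0.0], "yes"),
    ([0.0, 1.0, 0.0], "no"),
    ([0.0, 0.0, 1.0], "stop"),
]
RECOGNISER = build_recogniser(TOY_DATA)

# A slightly perturbed input is still assigned to the nearest word's learner.
print(recognise(RECOGNISER, [0.9, 0.1, 0.0]))
```

In this sketch each learner only has to separate its own word from everything else, which is the contrast the abstract draws with an MVSL recogniser, where a single network must memorise the entire vocabulary.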
