Learning Multi-Boosted HMMs for Lip-Password Based Speaker Verification

Authors: Xin Liu, Yiu-ming Cheung

DOI: 10.1109/TIFS.2013.2293025

Abstract: This paper proposes the concept of a lip motion password (simply called lip-password hereinafter), which is composed of a password embedded in the lip movement and the underlying characteristic of lip motion. It provides double security to a visual speaker verification system, where the speaker is verified by both the private password information and the underlying behavioral biometrics of lip motions simultaneously. Accordingly, the target speaker saying the wrong password, or an impostor who knows the correct password, will be detected and rejected. To this end, we present a multi-boosted Hidden Markov model (HMM) learning approach to such a system. Initially, we extract a group of representative visual features to characterize each lip frame. Then, an effective lip motion segmentation algorithm is addressed to segment the lip-password sequence into a small set of distinguishable subunits. Subsequently, we integrate HMMs with a boosting learning framework, associated with a random subspace method and a data sharing scheme, to formulate a precise decision boundary for verifying these subunits, featuring high discrimination power. Finally, the lip-password, whether spoken by the target speaker with the pre-registered password or not, is identified based on all the subunit verification results learned from the multi-boosted HMMs. The experimental results show that the proposed approach performs favorably compared with the state-of-the-art methods.
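
The combination of boosting with a random subspace method mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weak learner here is a simple threshold stump standing in for an HMM-based subunit classifier, the data sharing scheme is omitted, and all function names, parameters, and the toy data are assumptions made for illustration. Each boosting round draws a random subset of feature indices, fits the weak learner on that subset under the current sample weights, and reweights the samples in the usual AdaBoost manner.

```python
import math
import random


def train_stump(X, y, w, features):
    """Return the best weighted threshold stump over the given feature indices.

    The result is a tuple (weighted_error, feature, threshold, polarity).
    """
    best = None
    for f in features:
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                # Weighted error of the rule: predict pol if row[f] >= thr, else -pol.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[f] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] >= thr else -pol


def boost_random_subspace(X, y, rounds=10, subspace_size=2, seed=0):
    """AdaBoost in which each round sees only a random subset of features."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n  # uniform initial sample weights
    ensemble = []
    for _ in range(rounds):
        features = rng.sample(range(d), min(subspace_size, d))
        stump = train_stump(X, y, w, features)
        # Clip the error so the log below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote; labels are +1 (accept) and -1 (reject)."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): three features per sample, labels in {+1, -1}.
X = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.95], [0.1, 0.2, 0.3], [0.2, 0.1, 0.05]]
y = [1, 1, -1, -1]
model = boost_random_subspace(X, y, rounds=5, subspace_size=2, seed=0)
labels = [predict(model, row) for row in X]
```

In a setting closer to the one the abstract describes, the stump would be replaced by a classifier that scores a subunit sequence with an HMM, and one such boosted ensemble would be trained per subunit.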
