Estimation of the invariant and variant characteristics in speech articulation and its application to speaker identification

Authors: Abhay Prasad, Vijitha Periyasamy, Prasanta Kumar Ghosh

DOI: 10.1109/ICASSP.2015.7178775

Abstract: Speech articulation varies across speakers producing the same speech sound because of differences in their vocal tract morphologies, even though speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, there is also a variant component that reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation from multiple speakers into invariant and variant aspects when they speak the same sentence. The variant component is found to be a better representation for discriminating speakers compared with the speech articulation, which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production reveal that the variant component yields better frame-level speaker identification accuracy than the speech articulation as well as acoustic features, by 29.9% and 9.4% (absolute), respectively.

References (35)
J. Luettin, N.A. Thacker, S.W. Beet, Speaker identification by lipreading. International Conference on Spoken Language Processing, vol. 1, pp. 62-65 (1996). DOI: 10.1109/ICSLP.1996.607030
M.J. Carey, E.S. Parris, H. Lloyd-Thomas, S. Bennett, Robust prosodic features for speaker identification. International Conference on Spoken Language Processing, vol. 3, pp. 1800-1803 (1996). DOI: 10.1109/ICSLP.1996.607979
James W. Glenn, Norbert Kleiner, Speaker Identification Based on Nasal Phonation. The Journal of the Acoustical Society of America, vol. 43, pp. 368-372 (1968). DOI: 10.1121/1.1910788
V.L. Gracco, J.H. Abbs, Variant and invariant characteristics of speech movements. Experimental Brain Research, vol. 65, pp. 156-166 (1986). DOI: 10.1007/BF00243838
Mark K. Tiede, Vincent L. Gracco, Douglas M. Shiller, Carol Espy-Wilson, Suzanne E. Boyce, Perturbed palatal shape and North American English /r/ production. Journal of the Acoustical Society of America, vol. 117, pp. 2568-2569 (2005). DOI: 10.1121/1.4788555
Daniel J. Mashao, Marshalleno Skosan, Rapid and brief communication: Combining classifier decisions for robust speaker identification. Pattern Recognition, vol. 39, pp. 147-155 (2006). DOI: 10.1016/J.PATCOG.2005.08.004
Julie Carson-Berndsen, Phonological processing of speech variants. International Conference on Computational Linguistics, pp. 21-24 (1990). DOI: 10.3115/991146.991150
H. Wakita, Direct estimation of the vocal tract shape by inverse filtering of acoustic speech waveforms. IEEE Transactions on Audio and Electroacoustics, vol. 21, pp. 417-427 (1973). DOI: 10.1109/TAU.1973.1162506