Authors: Abhay Prasad, Vijitha Periyasamy, Prasanta Kumar Ghosh
DOI: 10.1109/ICASSP.2015.7178775
Keywords:
Abstract: Speech articulation varies across speakers for producing a speech sound due to the differences in their vocal tract morphologies, though the speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation from multiple speakers into the variant and invariant aspects when they speak the same sentence. The variant component is found to be a better representation for discriminating speakers compared to the speech articulation, which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production reveal that the variant component yields a better frame-level speaker identification accuracy compared to the speech articulation as well as acoustic features, by 29.9% and 9.4% (absolute) respectively.