Authors: T.H. Falk, Wai-Yip Chan
DOI: 10.1109/TASL.2009.2023679
Keywords:
Abstract: In this paper, auditory inspired modulation spectral features are used to improve automatic speaker identification (ASI) performance in the presence of room reverberation. The signal representation is obtained by first filtering the speech with a 23-channel gammatone filterbank. An eight-channel modulation filterbank is then applied to the temporal envelope of each filter output. Features are extracted from modulation frequency bands ranging from 3-15 Hz and are shown to be robust to mismatch between training and testing conditions and to increasing reverberation levels. To demonstrate the gains of the proposed features, experiments are performed with clean speech, artificially generated reverberant speech, and speech recorded in a meeting room. Simulation results show that a Gaussian mixture model based ASI system, trained on the proposed features, consistently outperforms a baseline system trained on mel-frequency cepstral coefficients. For multimicrophone applications, three multichannel score combination and adaptive channel selection techniques are investigated to further improve performance.
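
The envelope-then-modulation-analysis idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: a single synthetic amplitude-modulated tone stands in for one gammatone channel output, full-wave rectification with one-pole smoothing stands in for the temporal envelope extraction, and a single-bin DFT at each modulation frequency stands in for the eight-channel modulation filterbank. All function names and parameter values (`am_tone`, `temporal_envelope`, `modulation_energy`, `modulation_features`, the 50 Hz smoothing cutoff) are hypothetical.

```python
import math


def am_tone(fs, duration, carrier_hz, mod_hz, depth=0.8):
    """Synthetic test signal: a carrier whose amplitude varies at mod_hz."""
    n = int(fs * duration)
    return [
        (1.0 + depth * math.cos(2.0 * math.pi * mod_hz * i / fs))
        * math.sin(2.0 * math.pi * carrier_hz * i / fs)
        for i in range(n)
    ]


def temporal_envelope(x, fs, cutoff_hz=50.0):
    """Crude envelope: full-wave rectification, then a one-pole low-pass.

    Assumption: this replaces whatever envelope estimator the paper uses.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, y = [], 0.0
    for s in x:
        y += alpha * (abs(s) - y)
        env.append(y)
    return env


def modulation_energy(env, fs, mod_hz):
    """Energy of the mean-removed envelope at one modulation frequency."""
    n = len(env)
    mean = sum(env) / n
    re = sum((e - mean) * math.cos(2.0 * math.pi * mod_hz * i / fs)
             for i, e in enumerate(env))
    im = sum((e - mean) * math.sin(2.0 * math.pi * mod_hz * i / fs)
             for i, e in enumerate(env))
    return (re * re + im * im) / (n * n)


def modulation_features(x, fs, mod_freqs):
    """One energy value per modulation frequency for a single band signal."""
    env = temporal_envelope(x, fs)
    return [modulation_energy(env, fs, f) for f in mod_freqs]


if __name__ == "__main__":
    fs = 8000
    signal = am_tone(fs, 1.0, carrier_hz=1000.0, mod_hz=8.0)
    # Probe frequencies chosen inside the 3-15 Hz range named in the abstract.
    for f, e in zip([3.0, 5.0, 8.0, 11.0, 15.0],
                    modulation_features(signal, fs, [3.0, 5.0, 8.0, 11.0, 15.0])):
        print(f"{f:5.1f} Hz: {e:.6f}")
```

In a fuller system the same per-band analysis would be repeated for every acoustic channel, and the resulting feature vectors would feed the speaker models; those stages are outside this sketch.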