Authors: Jonathan Dennis, Qiang Yu, Huajin Tang, Huy Dat Tran, Haizhou Li
DOI: 10.1109/ICASSP.2013.6637759
Keywords:
Abstract: There is much evidence to suggest that the human auditory system uses localised time-frequency information for the robust recognition of sounds. Despite this, conventional systems typically rely on features extracted from short windowed frames over time, covering the whole frequency spectrum. Such approaches are not inherently robust to noise, as each frame will contain a mixture of spectral information from both noise and signal. Here, we propose a novel approach based on a temporal coding of Local Spectrogram Features (LSFs), which generate spikes that are used to train a Spiking Neural Network (SNN) with temporal learning. LSFs represent the local spectral information surrounding keypoints in the spectrogram, which are detected in a signal-driven manner such that the effect of noise is reduced. Our experiments demonstrate the robust performance of our approach across a variety of noise conditions, where it is able to outperform the frame-based baseline methods.
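The abstract describes a front end that detects keypoints in the spectrogram, extracts local patches (LSFs) around them, and converts them into spike times for an SNN. The sketch below is a minimal, hypothetical reading of that pipeline; the function names, the local-maximum keypoint criterion, the patch size, and the intensity-to-latency spike code are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch of a keypoint/LSF/spike-coding front end (assumptions:
# keypoints = local spectral maxima above a relative threshold; LSF = square
# patch around a keypoint; spikes = intensity-to-latency code).
import numpy as np
from scipy.signal import spectrogram
from scipy.ndimage import maximum_filter


def detect_keypoints(S, neighborhood=5, threshold_db=-40.0):
    """Return (freq, time) indices of local maxima within threshold_db of the peak."""
    S_db = 10.0 * np.log10(S + 1e-12)
    local_max = S_db == maximum_filter(S_db, size=neighborhood)
    strong = S_db > (S_db.max() + threshold_db)
    return np.argwhere(local_max & strong)


def extract_lsfs(S, keypoints, half_size=4):
    """Cut a square spectrogram patch centred on each keypoint."""
    S_pad = np.pad(S, half_size, mode="constant")
    return np.stack([
        S_pad[f:f + 2 * half_size + 1, t:t + 2 * half_size + 1]
        for f, t in keypoints
    ])


def lsf_to_spike_times(patch, t_max=10.0):
    """Latency code: stronger time-frequency bins fire earlier (times in ms)."""
    norm = patch / (patch.max() + 1e-12)
    return t_max * (1.0 - norm)


if __name__ == "__main__":
    fs = 16000
    x = np.random.randn(fs)                       # stand-in for a 1 s audio clip
    _, _, S = spectrogram(x, fs=fs, nperseg=256, noverlap=128)
    kps = detect_keypoints(S)
    lsfs = extract_lsfs(S, kps)
    spikes = lsf_to_spike_times(lsfs[0])
    print(len(kps), "keypoints; spike pattern shape", spikes.shape)
```

Because the keypoints are tied to spectral peaks rather than to fixed frames, most of them fall on signal-dominated regions, which is the intuition behind the abstract's claim of reduced sensitivity to additive noise.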