Authors: C. Spille, B. T. Meyer, M. Dietz, V. Hohmann
DOI: 10.1007/978-3-642-37762-4_6
Keywords:
Abstract: The segregation of concurrent speakers and other sound sources is an important aspect in improving the performance of audio technology, such as noise reduction and automatic speech recognition (ASR), in difficult acoustic conditions. This technology is relevant for applications like hearing aids, mobile devices, robotics, hands-free communication, and speech-based computer interfaces. Computational auditory-scene analysis (CASA) techniques simulate aspects of the processing properties of the human perceptual system using statistical signal processing to improve inferences about the causes of the input received by the system. This study argues that CASA is a promising approach to achieve source separation and outlines several theoretical arguments to support this hypothesis. With a focus on computational binaural scene analysis, the principles of CASA are reviewed. Furthermore, in an experimental approach, the applicability of a recent model of binaural interaction to ASR in multi-speaker conditions with spatially separated, moving speakers is explored. The model provides input to an inference filter that employs a priori information on possible source movements in order to track the positions of the speakers. The tracks are used to adapt a beamformer that selects a specific speaker. The beamformer output is subsequently used in an ASR task. Compared to the unprocessed (that is, mixed) data in a two-speaker condition, the word recognition rates obtained with the enhanced signals were increased from 30.8 to 88.4 %, demonstrating the potential of the proposed CASA-based approach.