Authors: Nelson Morgan, Suman V. Ravuri, Sherry Y. Zhao
DOI:
Keywords:
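Abstract: We report progress in the use of multi-stream spectro-temporal features for both small and large vocabulary automatic speech recognition tasks. In this approach, features are divided into multiple streams for parallel processing and dynamic utilization. For small vocabulary speech recognition experiments, the incorporation of up to 28 dynamically-weighted spectro-temporal feature streams along with MFCCs yields roughly 21% improvement on the baseline in low noise conditions and 47% improvement in noise-added conditions, a greater improvement on the baseline than in our previous work. A four stream framework yields a 14% improvement over the baseline in the large vocabulary low noise recognition experiment. These results suggest that the division of spectro-temporal features into multiple streams may be an effective way to flexibly utilize an inherently large number of features for automatic speech recognition.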