Authors: Emre Cakir, Toni Heittola, Heikki Huttunen, Tuomas Virtanen
DOI: 10.1109/IJCNN.2015.7280624
Keywords:
Abstract: In this paper, the use of multi-label neural networks is proposed for the detection of temporally overlapping sound events in realistic environments. Real-life recordings typically contain many overlapping sound events, making it hard to recognize each event with standard methods. Frame-wise spectral-domain features are used as inputs to train a deep neural network for multi-label classification. The model is evaluated on recordings from everyday environments, and the overall accuracy obtained is 63.8%. The method is compared against a state-of-the-art method that uses non-negative matrix factorization as a pre-processing stage and hidden Markov models as the classifier. The proposed method improves the accuracy by 19 percentage points overall.