Polyphonic sound event detection using multi label deep neural networks

Authors: Emre Cakir, Toni Heittola, Heikki Huttunen, Tuomas Virtanen

DOI: 10.1109/IJCNN.2015.7280624

Abstract: In this paper, the use of multi-label neural networks is proposed for the detection of temporally overlapping sound events in realistic environments. Real-life recordings typically contain many overlapping sound events, making it hard to recognize each event with standard methods. Frame-wise spectral-domain features are used as inputs to train a deep neural network for multi-label classification. The model is evaluated on recordings from realistic everyday environments and obtains an overall accuracy of 63.8%. The method is compared against a state-of-the-art method that uses non-negative matrix factorization as a pre-processing stage and hidden Markov models as a classifier. The proposed method improves the accuracy by 19 percentage points overall.
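
The multi-label idea in the abstract can be illustrated with a minimal sketch: each output unit gets its own sigmoid, so several event classes can be active in the same frame, unlike a softmax that picks a single class. Everything below is an assumption made for illustration and is not taken from the paper: the single ReLU hidden layer, the layer sizes, the 0.5 threshold, and the names `detect_events`, `w_hidden`, `b_hidden`, `w_out`, and `b_out` are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def detect_events(features, w_hidden, b_hidden, w_out, b_out, threshold=0.5):
    """Return a (frames, classes) boolean matrix of active events.

    features: (frames, n_features) array of frame-wise spectral features.
    The network shape (one ReLU hidden layer) is an illustrative assumption.
    """
    # Hypothetical hidden layer with ReLU activation.
    hidden = np.maximum(0.0, features @ w_hidden + b_hidden)
    # One independent sigmoid per event class: classes do not compete,
    # so overlapping events can be detected in the same frame.
    probs = sigmoid(hidden @ w_out + b_out)
    return probs >= threshold


# Toy input: 5 frames, 40 features, 16 hidden units, 3 event classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 40))
w_hidden = np.zeros((40, 16))
b_hidden = np.zeros(16)
w_out = np.zeros((16, 3))
# With zero weights the output depends only on the bias:
# classes 0 and 2 are switched on, class 1 is switched off.
b_out = np.array([10.0, -10.0, 10.0])

active = detect_events(features, w_hidden, b_hidden, w_out, b_out)
```

In this toy setup two classes are reported as active in every frame at once, which is the polyphonic case that a single-label classifier cannot represent. In practice the weights would be learned from annotated recordings rather than set by hand.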
