Multiple hypotheses at multiple scales for audio novelty computation within music

Authors: Florian Kaiser, Geoffroy Peeters

DOI: 10.1109/ICASSP.2013.6637643

Abstract: Novelty-based segmentation of audio signals has proven to perform well for estimating the boundaries of structural sections within music pieces. However, boundaries are detected only if the sections satisfy the condition of sufficient acoustic inner-homogeneity. Since this constraint is very restrictive and not representative of all musical content, we propose in this paper to extend novelty detection to transitions between homogeneous and non-homogeneous sections, and vice versa. Moreover, because the length of the considered boundary is crucial, we also introduce a multi-scale approach that captures different temporal scales within the same segmentation. Evaluation of the combination of these two methods shows convincing results. Embedding the novelty algorithm in a structure segmentation system, we show that segmentation can be consistently improved for this task.
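
To make the idea of multi-scale novelty more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the classic checkerboard-kernel novelty computed along the diagonal of a self-similarity matrix, which covers only transitions between two homogeneous sections. The function names, the kernel half-sizes, and the fusion rule (max-normalize each scale, then average) are all hypothetical choices made for illustration.

```python
import numpy as np


def checkerboard_kernel(half_size):
    """Square kernel of side 2*half_size: +1 on the two diagonal blocks
    (similarity within each side of a boundary), -1 on the off-diagonal
    blocks (similarity across the boundary)."""
    signs = np.concatenate([-np.ones(half_size), np.ones(half_size)])
    return np.outer(signs, signs)


def novelty_curve(ssm, half_size):
    """Slide the kernel along the main diagonal of a self-similarity matrix.

    curve[t] compares the half_size frames before frame t with the
    half_size frames starting at t. Borders are zero-padded.
    """
    n = ssm.shape[0]
    kernel = checkerboard_kernel(half_size)
    padded = np.pad(ssm, half_size, mode="constant")
    curve = np.empty(n)
    for t in range(n):
        window = padded[t:t + 2 * half_size, t:t + 2 * half_size]
        curve[t] = np.sum(window * kernel)
    return curve


def multiscale_novelty(ssm, half_sizes=(4, 8, 16)):
    """Hypothetical fusion of several temporal scales: clip negatives,
    max-normalize each curve, then average across scales."""
    curves = []
    for h in half_sizes:
        c = np.maximum(novelty_curve(ssm, h), 0.0)
        peak = c.max()
        curves.append(c / peak if peak > 0 else c)
    return np.mean(curves, axis=0)
```

On a toy matrix with two homogeneous blocks of 40 frames each, the fused curve peaks at frame 40, the first frame of the second block. Small kernels localize short sections more precisely, while large kernels are more robust to local variation, which is one reason to combine scales.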
