Guslar: A framework for automated singing voice correction

Authors: Elias Azarov, Maxim Vashkevich, Alexander Petrovsky

DOI: 10.1109/ICASSP.2014.6855142

Abstract: The paper presents a solution for singing voice processing that is used in a karaoke application with automated correction. The intended purpose of the application is to automatically improve the user's performance towards that of a professional singer by implementing effects such as pitch correction, artificial polyphony, time stretching, and others. The proposed framework incorporates a complete workflow including analysis, morphing, and synthesis. It uses an original model of voiced speech which represents each harmonic as a multicomponent function and provides high quality under conditions of partial glottalization.
