Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis

Authors: Tino Haderlein, Cornelia Schwemmle, Michael Döllinger, Václav Matoušek, Martin Ptok

DOI: 10.1155/2015/316325

Abstract: Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7 ± 17.8 years) containing the German version of the text “The North Wind and the Sun” were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners' ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r = 0.71, ρ = 0.57). These correlations were approximately the same as the interrater agreement among human raters (r = 0.65, ρ = 0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.
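
The r and ρ values in the abstract are Pearson and Spearman correlation coefficients between automatic and human ratings. The following is a minimal sketch of how the two coefficients can be computed. It covers only the agreement measure, not the Support Vector Regression step. The function names and the example scores are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation r between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation rho: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up illustration data: mean human roughness scores per speaker
    # and the corresponding automatic predictions.
    human = [0.2, 1.1, 1.9, 0.8, 2.6, 1.4]
    machine = [0.5, 0.9, 2.2, 1.0, 2.1, 1.6]
    print("r   =", round(pearson(human, machine), 2))
    print("rho =", round(spearman(human, machine), 2))
```

Reporting both coefficients is useful because r is sensitive to the linear fit of the predicted scores, while ρ depends only on whether speakers are ranked in the same order.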
