Deep-learning-based segmentation of the vocal tract and articulators in real-time magnetic resonance images of speech.

Authors: Matthieu Ruthven, Marc E. Miquel, Andrew P. King

DOI: 10.1016/J.CMPB.2020.105814

Keywords:

Abstract: Background and Objective: Magnetic resonance (MR) imaging is increasingly used in studies of speech as it enables non-invasive visualisation of the vocal tract and articulators, thus providing information about their shape, size, motion and position. Extraction of this information for quantitative analysis is achieved using segmentation. Methods have been developed to segment the vocal tract; however, none of these also fully segment any of the articulators. The objective of this work was to develop a method to segment multiple groups of articulators as well as the vocal tract in two-dimensional MR images of speech, overcoming the limitations of existing methods. Methods: Five MR image sets (392 images in total), each of a different healthy adult volunteer, were used in this work. A convolutional neural network with an architecture similar to the original U-Net was trained to segment the following six regions in the image sets: head, soft palate, jaw, tongue, vocal tract and tooth space. A five-fold cross-validation was performed to investigate the segmentation accuracy and generalisability of the network. Segmentation accuracy was assessed using standard overlap-based metrics (Dice coefficient and general Hausdorff distance) and a novel clinically relevant metric based on velopharyngeal closure. Results: The segmentations created by the network had a median Dice coefficient of 0.92 and a median general Hausdorff distance of 5 mm. The network segmented the head most accurately (median Dice coefficient of 0.99) and the soft palate and tooth space least accurately (median Dice coefficients of 0.92 and 0.93 respectively). The segmentations correctly showed 90% (27 out of 30) of the velopharyngeal closures in the image sets. Conclusions: An automatic method to segment the vocal tract and multiple groups of articulators in two-dimensional MR images of speech was successfully developed. The method is intended for use in clinical and non-clinical studies of speech which involve analysis of the shape, size, motion or position of the vocal tract and articulators. In addition, a novel clinically relevant metric for assessing articulator segmentation methods was proposed.
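The abstract reports two standard overlap-based metrics, the Dice coefficient and a general (symmetric) Hausdorff distance, computed per segmented region. The sketch below is not the authors' code; it is a minimal illustration of how these metrics could be computed for a pair of integer label maps using NumPy and SciPy, with illustrative (assumed) label values and array names.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

# Illustrative label values for the six regions (assumed, not taken from the paper)
LABELS = {1: "head", 2: "soft palate", 3: "jaw", 4: "tongue", 5: "vocal tract", 6: "tooth space"}

def dice_coefficient(pred: np.ndarray, gt: np.ndarray, label: int) -> float:
    """Dice overlap for one label in two integer label maps of the same shape."""
    p, g = (pred == label), (gt == label)
    denom = p.sum() + g.sum()
    return 2.0 * np.logical_and(p, g).sum() / denom if denom > 0 else np.nan

def hausdorff_distance(pred: np.ndarray, gt: np.ndarray, label: int) -> float:
    """Symmetric Hausdorff distance (in pixels) between the pixel sets of one label.

    Multiply by the in-plane pixel spacing to convert to millimetres.
    """
    p_pts = np.argwhere(pred == label)
    g_pts = np.argwhere(gt == label)
    if len(p_pts) == 0 or len(g_pts) == 0:
        return np.inf
    return max(directed_hausdorff(p_pts, g_pts)[0],
               directed_hausdorff(g_pts, p_pts)[0])

# Example usage with two random 256x256 label maps
rng = np.random.default_rng(0)
pred = rng.integers(0, 7, size=(256, 256))
gt = rng.integers(0, 7, size=(256, 256))
for label, name in LABELS.items():
    print(name, dice_coefficient(pred, gt, label), hausdorff_distance(pred, gt, label))
```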

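The abstract also describes a five-fold cross-validation over five image sets, each from a different volunteer. One plausible reading is a leave-one-speaker-out split, with each fold validating on the held-out volunteer's image set; the sketch below illustrates this reading only, with hypothetical speaker IDs, filenames and per-speaker image counts.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from speaker ID to that speaker's image filenames
image_sets: Dict[str, List[str]] = {
    f"speaker_{i}": [f"speaker_{i}_frame_{j:03d}.png" for j in range(80)]
    for i in range(1, 6)
}

def five_fold_splits(sets: Dict[str, List[str]]) -> List[Tuple[List[str], List[str]]]:
    """One fold per speaker: train on four image sets, validate on the held-out one."""
    speakers = sorted(sets)
    folds = []
    for held_out in speakers:
        train = [img for s in speakers if s != held_out for img in sets[s]]
        val = list(sets[held_out])
        folds.append((train, val))
    return folds

for k, (train, val) in enumerate(five_fold_splits(image_sets), start=1):
    print(f"fold {k}: {len(train)} training images, {len(val)} validation images")
```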