On the Initialization of Dynamic Models for Speech Features

Authors: Johannes Blömer, Reinhold Haeb-Umbach, Volker Leutnant, Alexander Krueger, Marcel R. Ackermann

Abstract: In this work, a novel approach for the initialization of switching linear dynamic models (SLDMs) as trajectory models of speech features is proposed. Borrowing ideas from the "k-means++" algorithm, the goal is to find distinctly different SLDMs, modelling the complex dynamics of speech features, already at the initialization stage of the subsequently following "expectation-maximization (EM)" algorithm. Experimental results comparing differently initialized SLDMs in a model-based speech feature enhancement scheme show the superiority of the proposed initialization routine in terms of reduced word error rate on an automatic speech recognition task.
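
The seeding idea that the abstract borrows from can be illustrated with a minimal sketch of standard k-means++ seeding (D² sampling) on plain feature vectors. This is not the paper's SLDM initialization routine, whose details the abstract does not give; the function names `squared_distance` and `kmeanspp_seeds` and the toy data are assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k seeds from `points` by standard k-means++ (D^2) sampling.

    Illustrative only: the paper applies the idea to SLDMs, whereas this
    sketch works on plain feature vectors.
    """
    rng = rng or random.Random(0)
    # First seed: chosen uniformly at random.
    seeds = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest seed.
    d2 = [squared_distance(p, seeds[0]) for p in points]
    while len(seeds) < k:
        if sum(d2) == 0:
            # Every point coincides with a seed; fall back to uniform choice.
            seeds.append(rng.choice(points))
        else:
            # Sample a point with probability proportional to d2, so points
            # far from all current seeds are preferred.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            seeds.append(points[idx])
        # Update nearest-seed distances with the newly added seed.
        d2 = [min(d, squared_distance(p, seeds[-1])) for d, p in zip(d2, points)]
    return seeds


# Toy data: three well-separated pairs of 2-D points (made up for the example).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.1)]
seeds = kmeanspp_seeds(points, 3, random.Random(1))
print(seeds)
```

Because a point that is already a seed has zero sampling weight, the seeds are pairwise distinct whenever the input points are distinct. That property is the sense in which the seeding favours "distinctly different" starting points before any EM-style refinement.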
