Segmentation of Malay Syllables in Connected Digit Speech Using Statistical Approach

Authors: Dzulkifli Mohamad, Md. Sah Salam

Abstract: This study presents the segmentation of syllables in Malay connected digit speech. Segmentation was done on the time-domain signal using statistical approaches, namely Brandt's Generalized Likelihood Ratio (GLR) algorithm and the Divergence algorithm. These algorithms detect abrupt changes in signal energy in order to determine segmentation points. The patterns used in this experiment are connected digits from 11 speakers, spoken in read mode in a lab environment and in spontaneous mode in a classroom environment. The aim is to get a close match between the reference segmentation points and the automatic segmentation points. Experiments were conducted to see the effect of the auto-regressive model order p and the sliding window length L in giving a better match. The paper reports findings on four criteria, including insertions, omissions, and accuracy, for the two algorithms. The results show that the Divergence algorithm performed only slightly better and responds to the tested parameters in the opposite way compared with GLR. Read mode, in comparison to spontaneous mode, has fewer omissions but more insertions.
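The idea of locating an abrupt change with a likelihood ratio over auto-regressive models can be illustrated with a minimal sketch. It is not the authors' implementation: it scans every split point of one fixed analysis window offline, whereas Brandt's procedure is sequential, and it estimates the AR(p) innovation variance with the autocorrelation method and Levinson-Durbin recursion, which is an assumption made here. The names `ar_residual_variance`, `glr_change_point`, `make_test_signal`, and the parameter `min_len` are hypothetical; `p` plays the role of the model order and the length of the input list the role of the window length L.

```python
import math
import random


def ar_residual_variance(x, p):
    """Innovation variance of an AR(p) fit (autocorrelation method).

    Assumes x is roughly zero-mean, as speech samples usually are.
    """
    n = len(x)
    # Biased autocorrelation estimates r[0..p].
    r = [sum(x[i] * x[i + k] for i in range(n - k)) / n for k in range(p + 1)]
    err = r[0]
    a = [0.0] * (p + 1)  # a[j] is the predictor coefficient for lag j
    for m in range(1, p + 1):
        if err <= 1e-12:
            break
        # Reflection coefficient of the Levinson-Durbin recursion.
        k = (r[m] - sum(a[j] * r[m - j] for j in range(1, m))) / err
        new_a = a[:]
        new_a[m] = k
        for j in range(1, m):
            new_a[j] = a[j] - k * a[m - j]
        a = new_a
        err *= 1.0 - k * k
    return max(err, 1e-12)  # floor avoids log(0) on silent segments


def glr_change_point(x, p=2, min_len=20):
    """Return (split index, statistic) maximising the GLR-style distance.

    Compares one AR(p) model for the whole window against two separate
    models for the samples before and after each candidate split.
    """
    n = len(x)
    whole = n * math.log(ar_residual_variance(x, p))
    best_r, best_d = None, float("-inf")
    for r in range(min_len, n - min_len + 1):
        left = r * math.log(ar_residual_variance(x[:r], p))
        right = (n - r) * math.log(ar_residual_variance(x[r:], p))
        d = whole - left - right
        if d > best_d:
            best_r, best_d = r, d
    return best_r, best_d


def make_test_signal(seed=0, half=200):
    """Synthetic signal (not speech): quiet noise followed by loud noise."""
    rng = random.Random(seed)
    quiet = [rng.gauss(0.0, 0.1) for _ in range(half)]
    loud = [rng.gauss(0.0, 1.0) for _ in range(half)]
    return quiet + loud


if __name__ == "__main__":
    split, stat = glr_change_point(make_test_signal(), p=2)
    print("estimated change point:", split, "statistic:", round(stat, 1))
```

On the synthetic signal, the energy jump sits at sample 200, so the estimated split should land close to it. A practical segmenter would slide such a window along the utterance and accept a split only when the statistic exceeds a threshold, and that threshold is another parameter this sketch leaves open.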
