An Approach to Normalization of Dai Text for Speech Synthesis

Author: 烛梅伍

DOI: 10.12677/CSA.2016.67051

Keywords: Natural language processing, Artificial intelligence, Computer science

Abstract: With the aim of developing a Dai speech synthesis system, this paper focuses on the normalization of numbers and special characters in Dai text. Numbers and special characters are both non-standard words in Dai text, and the main purpose of text normalization is to represent the pronunciation of non-standard words with standard words. The normalization process comprises four steps: non-standard word recognition, ambiguity judgment, disambiguation, and conversion of non-standard words into standard words. Non-standard words are recognized with a method that combines rules and context keywords, their ambiguity types are judged with regular expressions, and they are then disambiguated according to transformation rules to determine their correct Dai reading. Experimental results show that the proposed text normalization method reaches an accuracy of 94.6%, which the paper reports is sufficient for front-end text analysis in a Dai text-to-speech system and of good application value for natural language processing.
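The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The lexicon uses English words as stand-ins for Dai standard words, and the pattern, the keyword set, and the function names (`classify`, `cardinal`, `normalize`) are all hypothetical choices made for illustration.

```python
import re

# Hypothetical stand-in lexicon: English words replace Dai standard words.
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"]
SYMBOLS = {"%": "percent"}

# Assumed context keywords that force a digit-by-digit reading.
DIGIT_BY_DIGIT_KEYWORDS = {"room", "phone", "tel"}

# Step 1: recognize non-standard words (digit runs and listed symbols).
NSW_PATTERN = re.compile(r"[0-9]+|%")


def cardinal(n: int) -> str:
    """Spell out 0..999999 as a cardinal number."""
    if n < 10:
        return DIGITS[n]
    if n < 20:
        return TEENS[n - 10]
    if n < 100:
        tens, unit = divmod(n, 10)
        return TENS[tens] + ("" if unit == 0 else " " + DIGITS[unit])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return DIGITS[hundreds] + " hundred" + ("" if rest == 0 else " " + cardinal(rest))
    thousands, rest = divmod(n, 1000)
    return cardinal(thousands) + " thousand" + ("" if rest == 0 else " " + cardinal(rest))


def classify(token: str, left_context: str) -> str:
    """Steps 2 and 3: judge the ambiguity type and resolve it from context."""
    if not token.isdigit():
        return "symbol"
    words = left_context.lower().split()
    prev = words[-1].strip(":,.") if words else ""
    # Keyword, leading zero, or a very long run: read digit by digit.
    if prev in DIGIT_BY_DIGIT_KEYWORDS or token.startswith("0") or len(token) > 6:
        return "digits"
    return "cardinal"


def normalize(text: str) -> str:
    """Step 4: convert each non-standard word into standard words."""
    def replace(match: re.Match) -> str:
        token = match.group()
        kind = classify(token, text[:match.start()])
        if kind == "symbol":
            return " " + SYMBOLS[token]
        if kind == "digits":
            return " ".join(DIGITS[int(d)] for d in token)
        return cardinal(int(token))
    return NSW_PATTERN.sub(replace, text)


print(normalize("Room 205 holds 48 people, 50% more than before."))
```

In this sketch the same digit string can receive different readings depending on the preceding keyword, which is the kind of ambiguity the abstract describes. A real system would need Dai-specific rules and vocabulary.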
