Lexicon splitting in lexical disambiguation for Malay morphological analysis and stemming

Authors: Mohd Yunus Sharum, Muhamad Taufik Abdullah, Md Nasir Sulaiman, Masrah Azrifah Azmi Murad, Zaitul Azma Zainon Hamzah

DOI: 10.4156/JNIT.VOL4.ISSUE5.2

Keywords: Homonym, Context analysis, Malay, Speech recognition, Artificial intelligence, Lexicon, Exception handling, Feature (linguistics), Ambiguity, Word (computer architecture), Natural language processing, Computer science

Abstract: Lexical ambiguity is one of the problems faced by morphological analysers and stemmers. It is caused by ambiguous word forms such as homonyms, which can lead these tools to produce incorrect output. Thus, a method that can resolve this ambiguity may improve the performance of such tools. Malay affixation differentiates between monosyllabic and multisyllabic words. A disambiguation method is proposed for use in the lexicon for morphological analysis and stemming, by splitting the lexicon into monosyllabic and multisyllabic words. We found that this feature helps with disambiguation involving these words, with handling the language's exceptions, and with storage and lookup. This would be useful for morphological analysis and stemming, as it does not require document-level context to be analysed.
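The idea of splitting the lexicon by syllable count can be illustrated with a minimal Python sketch. This is not the authors' algorithm, only one plausible reading of the abstract. It assumes that the prefix form "menge-" attaches to monosyllabic roots (as in "mengecat" from "cat") while "meng-" attaches to multisyllabic roots, and it estimates syllables crudely by counting vowel letters. The names `count_syllables`, `split_lexicon`, `PREFIX_RULES`, and `stem`, as well as the four-word lexicon, are hypothetical and chosen for illustration.

```python
VOWELS = set("aeiou")


def count_syllables(word):
    """Crude estimate (assumption): one syllable per vowel letter.

    Diphthongs are not handled, so this is only good enough to tell
    one-vowel roots apart from longer ones in this small example.
    """
    return sum(1 for ch in word.lower() if ch in VOWELS)


def split_lexicon(roots):
    """Split a root-word lexicon into monosyllabic and multisyllabic sets."""
    mono = {r for r in roots if count_syllables(r) == 1}
    multi = set(roots) - mono
    return mono, multi


# Each rule pairs a prefix form with the sub-lexicon its root must come from.
# Only two forms of one prefix are listed here; this is an illustrative subset.
PREFIX_RULES = [
    ("menge", "mono"),   # e.g. menge + cat
    ("meng", "multi"),   # e.g. meng + ajar
]


def stem(word, mono, multi):
    """Return the root of `word`, or None if no rule yields a known root."""
    if word in mono or word in multi:
        return word
    lexicons = {"mono": mono, "multi": multi}
    for prefix, kind in PREFIX_RULES:
        if word.startswith(prefix):
            candidate = word[len(prefix):]
            # Look the candidate up only in the sub-lexicon the rule allows.
            if candidate in lexicons[kind]:
                return candidate
    return None


# Hypothetical miniature lexicon for demonstration.
mono, multi = split_lexicon(["cat", "bom", "ajar", "ekor"])
print(stem("mengecat", mono, multi))  # cat
print(stem("mengekor", mono, multi))  # ekor
print(stem("mengcat", mono, multi))   # None
```

In this sketch each candidate root is checked against only one of the two sub-lexicons, so the ill-formed "mengcat" is rejected even though "cat" is a known root. The decision uses the word form alone and needs no document-level context, which matches the property the abstract highlights.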
