Authors: Mushtag Ahmad, Stefan Gruner, Muhammad Tanvir Afzal
DOI:
Keywords:
Abstract: Medieval manuscripts and other written documents from that period contain valuable information about the people, religion, and politics of the medieval period, making their study a necessary prerequisite to gaining in-depth knowledge of medieval history. Although tool-less study of such documents is possible and has been ongoing for centuries, much subtle information remains locked in them unless it is revealed by effective means of computational analysis. Automatic analysis of medieval manuscripts is a non-trivial task, mainly due to non-conforming styles, spelling peculiarities, and the lack of relational structures (hyper-links) which could be used to answer meaningful queries. Natural Language Processing (NLP) tools and algorithms are used to carry out computational analysis of text data. However, due to the high percentage of spelling variations in medieval manuscripts, NLP tools and algorithms cannot be applied to them directly. If the spelling variations are mapped to standard dictionary words, then the application of standard NLP tools and algorithms becomes possible. In this paper we describe a web-based software tool, CAMM (Computational Analysis of Medieval Manuscripts), that maps medieval spelling variations to a modern German dictionary. Here we describe the steps taken to acquire, reformat, and analyze the data, to produce putative mappings, and to evaluate the findings. At the time of writing this paper, CAMM provides access to 11275 manuscripts organized into 54 collections containing a total of 242446 distinctly spelled words. CAMM accurately corrects the spelling of 55% of the verifiable words. CAMM is freely available at http://researchworks.cs.athabascau.ca/