Authors: Heiko Maus, Sven Schwarz, Christian Jilek, Andreas Dengel, Markus Schröder
DOI: 10.4230/OASICS.LDK.2019.11
Keywords: Ontology (information science), Process (engineering), Ontology, Named-entity recognition, Artificial intelligence, German, Information extraction, Quality (business), Natural language processing, Word (computer architecture), Precision and recall, Computer science, Task (project management)
Abstract: A growing number of applications that users interact with daily have to operate in (near) real-time: chatbots, digital companions, and knowledge work support systems, to name just a few. To perform the services desired by the user, these systems have to analyze user activity logs or explicit input extremely fast. In particular, text content (e.g. in the form of snippets) needs to be processed in an information extraction task. Given the aforementioned temporal requirements, this has to be accomplished in a few milliseconds, which limits the methods that can be applied. In practice, only very fast methods remain, which on the other hand deliver worse results than slower but more sophisticated Natural Language Processing (NLP) pipelines. In this paper, we investigate and propose methods for real-time capable Named Entity Recognition (NER). As a first improvement step, we address word variations induced by inflection, as present, for example, in the German language. Our approach is ontology-based and makes use of several language sources like Wiktionary. We evaluated it using Wikipedia (about 9.4B characters), for which the whole NER process took considerably less than an hour. Since precision and recall are higher than with comparably fast methods, we conclude that the quality gap between high-speed methods and sophisticated NLP pipelines can be narrowed a bit without losing too much runtime performance.
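The general idea of inflection-tolerant, ontology-based lookup can be illustrated with a minimal sketch. This is not the authors' implementation: the inflection table, the ontology entries, the `ex:` identifiers, and the function names `normalize`, `build_index`, and `recognize` are all hypothetical, and the hand-written table merely stands in for data that could be derived from a source such as Wiktionary. The sketch assumes that ontology labels and input tokens are reduced to a common base form and then matched with a greedy longest-match dictionary lookup.

```python
import re

# Hypothetical inflection table standing in for data derived from a source
# such as Wiktionary: inflected form -> base form.
INFLECTIONS = {
    "hauses": "haus",
    "häuser": "haus",
    "häusern": "haus",
    "grünen": "grün",
    "grünes": "grün",
    "grüne": "grün",
}

# Hypothetical ontology: concept identifier -> label.
ONTOLOGY = {
    "ex:GreenHouse": "Grünes Haus",
    "ex:Berlin": "Berlin",
}

TOKEN = re.compile(r"\w+")


def normalize(token):
    """Lowercase a token and map it to its base form if one is known."""
    lowered = token.lower()
    return INFLECTIONS.get(lowered, lowered)


def build_index(ontology):
    """Map tuples of normalized label tokens to concept identifiers."""
    index = {}
    for concept, label in ontology.items():
        key = tuple(normalize(t) for t in TOKEN.findall(label))
        index[key] = concept
    return index


def recognize(text, index):
    """Return (surface form, concept) pairs using greedy longest match."""
    matches = list(TOKEN.finditer(text))
    norm = [normalize(m.group()) for m in matches]
    max_len = max((len(key) for key in index), default=0)
    found = []
    i = 0
    while i < len(norm):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(norm) - i), 0, -1):
            concept = index.get(tuple(norm[i:i + n]))
            if concept is not None:
                surface = text[matches[i].start():matches[i + n - 1].end()]
                found.append((surface, concept))
                i += n
                break
        else:
            # No label starts at this token; move on.
            i += 1
    return found


INDEX = build_index(ONTOLOGY)
```

Calling `recognize("Die Fenster des Grünen Hauses in Berlin", INDEX)` matches the inflected phrase "Grünen Hauses" against the label "Grünes Haus" because both normalize to the same token tuple. Each token costs only a few hash lookups, which is the kind of cost profile a millisecond budget calls for, although a real system would need a far larger inflection resource and a strategy for ambiguous forms.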