Authors: Daniel Esser, Daniel Schuster, Klemens Muthmann, Michael Berger, Alexander Schill
DOI: 10.1117/12.908542
Keywords: Full text search, Automatic indexing, Information extraction, Computer science, Well-formed document, Index term, Document clustering, Information retrieval, Index (publishing), Document management system
Abstract: Archiving official written documents such as invoices, reminders, and account statements is becoming increasingly important in both business and private settings. Creating appropriate index entries for document archives, such as the sender's name, creation date, or document number, is tedious manual work. We present a novel approach to automatic indexing of documents based on generic positional extraction of index terms. For this purpose, we apply the knowledge of document templates stored in a common full-text search index to find the positions of index terms that were successfully extracted in the past.