Authors: Christoffer Aminoff, Aleksei Romanenko, Onni Kosomaa, Jouko Vankka
DOI: 10.1007/978-3-319-99972-2_47
Keywords:
Abstract: In this paper, a document classification system is enhanced through the construction of a text augmentation technique, by testing various Part-of-Speech filters and word vector weighting methods with nine different models for document representation. Subject/object tagging is introduced as a new form of augmentation, along with a novel approach grounded in a method based on the distribution of words among classes of documents. When an augmentation including subject/object tagging, a nouns+adjectives filter, and Inverse Document Frequency weighting was applied, an average increase in accuracy of 4.1 percentage points was observed.
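
To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of a nouns+adjectives Part-of-Speech filter followed by Inverse Document Frequency weighting of word vectors. It is an illustration under stated assumptions, not the authors' implementation: the toy corpus, the hand-assigned POS tags, the 2-dimensional vectors, and the helper names `pos_filter`, `idf_table`, and `doc_vector` are all hypothetical, and subject/object tagging is not shown.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each document is a list of (token, POS tag) pairs.
# Tags are hand-assigned here; a real system would obtain them from a POS tagger.
DOCS = [
    [("the", "DET"), ("fast", "ADJ"), ("car", "NOUN"), ("stops", "VERB")],
    [("a", "DET"), ("red", "ADJ"), ("car", "NOUN"), ("turns", "VERB")],
    [("the", "DET"), ("old", "ADJ"), ("house", "NOUN"), ("stands", "VERB")],
]

# Hypothetical 2-dimensional word vectors (assumption: real ones would come
# from a trained embedding model).
VECTORS = {
    "fast": [1.0, 0.0],
    "red": [1.0, 1.0],
    "old": [0.5, 0.5],
    "car": [0.0, 1.0],
    "house": [0.0, 2.0],
}

KEEP_TAGS = {"NOUN", "ADJ"}  # the nouns+adjectives filter


def pos_filter(doc, keep_tags=KEEP_TAGS):
    """Keep only the tokens whose POS tag is in keep_tags."""
    return [tok for tok, tag in doc if tag in keep_tags]


def idf_table(docs):
    """Compute IDF(t) = log(N / df(t)) over a list of token lists."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each token once per document
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def doc_vector(tokens, idf, vectors):
    """IDF-weighted average of the word vectors of the given tokens."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in vectors:
            continue  # skip out-of-vocabulary tokens
        weight = idf.get(tok, 0.0)
        weight_sum += weight
        for i, value in enumerate(vectors[tok]):
            total[i] += weight * value
    if weight_sum == 0.0:
        return total  # no weighted tokens: return the zero vector
    return [value / weight_sum for value in total]


filtered = [pos_filter(doc) for doc in DOCS]
idf = idf_table(filtered)
doc_vecs = [doc_vector(tokens, idf, VECTORS) for tokens in filtered]
```

In this sketch, a word that occurs in fewer documents (such as "fast") receives a larger weight than one shared across documents (such as "car"), so rarer nouns and adjectives dominate the resulting document vector. The vectors in `doc_vecs` could then be passed to any classifier.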