Authors: Ulli Waltinger, Rüdiger Gleim, Alexander Mehler
DOI:
Keywords:
Abstract: Text categorization is a fundamental part of many NLP applications. In general, the Vector Space Model, Latent Semantic Analysis and Support Vector Machine implementations have been successfully applied within this area. However, feature extraction is the most challenging task when conducting categorization experiments. Moreover, sensitive feature reduction is needed in order to reduce time and space complexity, especially when dealing with singular value decomposition or larger sized text collections. In this paper we examine the task of feature reduction by means of closed topic models. We propose a feature replacement technique by a topic generalization comprising user generated concepts of a social ontology. Derived concepts are then subsequently used to enhance and replace existing features, gaining a minimum representation of twenty concepts. We examine the effect of each step in the classification process using a large corpus of 29,086 texts comprising 30 different categories. In addition, we offer an easy-to-use web interface as part of the eHumanities Desktop in order to test the proposed classifiers.