Authors: Eleni Giannopoulou, Nikolas Mitrou
DOI: 10.1109/ACCESS.2020.3041651
Keywords:
Abstract: Book recommendation to support professors and students in the identification of relevant sources is of significant importance for both universities and digital libraries and, hence, motivates the development of a book recommendation system. This paper aims at automatically classifying a multiclass corpus that was created from the ebooks of the Springer collection, which is available through the Hellenic Academic Libraries’ subscription, by utilizing an unsupervised neural network (NN) (self-organizing maps, SOM) and two deep neural network (DNN) architectures, namely, long short-term memory (LSTM) and a convolutional neural network (CNN) combined with LSTM (CNN+LSTM), under various configuration scenarios. The vector construction leverages information extracted from the table of contents (ToC) of each book, using the TF-IDF weighting scheme (for the first case) and the Keras tokenizer (for the second). Extensive experiments were conducted under various configurations of preprocessing steps, NN setup, and vocabulary sizes to assess their impact on the classifier’s performance. Furthermore, we show that majority voting is more suitable for selecting the dominant label of a specified node. The experimental analysis showed the feasibility of developing a system for supporting related book recommendation based on a detailed thematic description (e.g., the abstract or the table of contents of a book) rather than a few keywords. In the experiments, the subsystem that utilized the DNN performed best, with F1-scores of 67% for 26 categories and 80% for 5 general categories, whereas the SOM realizes less than 5% in both cases.
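The abstract mentions majority voting for selecting the dominant label of a node but does not describe the procedure. The following is a minimal sketch of one common way to do this, assuming each book has already been mapped to a best-matching SOM node; the function name `dominant_labels`, the node coordinates, and the category names are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter, defaultdict


def dominant_labels(node_assignments, labels):
    """Return a dict mapping each node to its most frequent label.

    node_assignments: one node identifier per sample (here, grid coordinates).
    labels: the known category of each sample, in the same order.
    """
    votes = defaultdict(Counter)
    for node, label in zip(node_assignments, labels):
        votes[node][label] += 1
    # most_common(1) yields the label with the highest count for the node;
    # on a tie, the label encountered first among the tied ones is kept.
    return {node: counts.most_common(1)[0][0] for node, counts in votes.items()}


# Hypothetical toy data: five books mapped to two SOM nodes.
nodes = [(0, 0), (0, 0), (0, 0), (1, 2), (1, 2)]
labels = ["Mathematics", "Mathematics", "Physics", "Medicine", "Medicine"]
node_labels = dominant_labels(nodes, labels)
```

Under this scheme, a new book would receive the dominant label of whichever node it is mapped to; nodes that received no labeled samples have no entry and would need a separate fallback rule.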