Authors: Nabil Alami, Noureddine En-nahnahi, Said Alaoui Ouatik, Mohammed Meknassi
DOI: 10.1007/S13369-018-3198-Y
Keywords:
Abstract: Traditional Arabic text summarization (ATS) systems are based on a bag-of-words representation, which involves sparse and high-dimensional input data. Thus, dimensionality reduction is greatly needed to increase the discriminative power of the features. In this paper, we present a new method for ATS using a variational auto-encoder (VAE) model to learn a feature space from high-dimensional input data. We explore several input representations, such as term frequency (tf) and tf-idf, with both local and global vocabularies. All sentences are ranked according to the latent representation produced by the VAE. We investigate the impact of using the VAE with two summarization approaches: graph-based and query-based. Experiments on benchmark datasets specifically designed for ATS show that the VAE with these vocabularies clearly provides a more discriminative feature space and improves the recall of other models. Experimental results confirm that the proposed method leads to better performance than most state-of-the-art extractive approaches.
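
The input representations and the query-based ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the VAE encoder is omitted entirely, so sentences are ranked directly in the tf-idf space as a stand-in for the learned latent space. The function names, the toy English tokens, the smoothed idf formula, and the use of cosine similarity are all assumptions made for illustration. The vocabulary here is local (built from the document itself); a global vocabulary would instead be built from a whole corpus.

```python
import math
from collections import Counter


def tf_vector(tokens, vocab):
    """Raw term-frequency vector of one tokenized sentence over a fixed vocabulary."""
    counts = Counter(tokens)
    return [float(counts[w]) for w in vocab]


def idf_weights(sentences, vocab):
    """Smoothed inverse document frequency, treating each sentence as a document.

    The smoothing (add one to numerator and denominator, then add 1.0)
    is an assumption of this sketch, not taken from the paper.
    """
    n = len(sentences)
    weights = {}
    for w in vocab:
        df = sum(1 for s in sentences if w in s)
        weights[w] = math.log((1 + n) / (1 + df)) + 1.0
    return weights


def tfidf_vector(tokens, vocab, idf):
    """tf-idf vector: term frequency scaled by the idf weight of each word."""
    return [tf * idf[w] for tf, w in zip(tf_vector(tokens, vocab), vocab)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_by_query(sentences, query, vocab):
    """Return sentence indices sorted by decreasing similarity to the query.

    In a VAE-based pipeline, the tf-idf vectors would first be mapped to
    latent vectors by the trained encoder; here they are compared directly.
    """
    idf = idf_weights(sentences, vocab)
    q = tfidf_vector(query, vocab, idf)
    scores = [cosine(tfidf_vector(s, vocab, idf), q) for s in sentences]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


# Toy, pre-tokenized input (hypothetical; real Arabic text needs its own preprocessing).
sentences = [
    ["summarization", "selects", "important", "sentences"],
    ["the", "weather", "is", "sunny"],
    ["sentences", "are", "ranked", "for", "summarization"],
]
query = ["summarization", "sentences"]

# Local vocabulary: every distinct word in this document.
vocab = sorted({w for s in sentences for w in s})

ranking = rank_by_query(sentences, query, vocab)
print(ranking)
```

In this toy input the second sentence shares no words with the query, so it is ranked last. A graph-based variant would instead compare sentences with each other and score them by their centrality in the resulting similarity graph.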