Research on topic discovery technology for Web news

Authors: Guixian Xu, Ziheng Yu, Changzhi Wang, Antai Wang

DOI: 10.1007/S00521-018-3744-2

Keywords:

Abstract: With the development of information technology, Web news has become the main way of information dissemination. Web news topic discovery helps users quickly find valuable information, and research on it is constantly improving. Traditional topic discovery is based on the vector space model, but it has defects such as high dimensionality and data sparsity. Latent semantic analysis, however, can map high-dimensional, sparse words to a k-dimensional semantic space and improve the similarity of texts on the same topic by using the correlation between words. In this paper, Web news topic discovery technology is studied. First, the news text set is vectorized and the weight of each feature in the texts is calculated by an improved TFIDF. After the original feature set is analysed by latent semantic analysis, the semantic relation between words is fully exploited, and the news topics are extracted by a clustering approach. For the extraction of sub-topics, co-occurrence words are used to display the sub-topics; in essence, each sub-topic is established through these words. The experimental results show that the proposed method can effectively capture current hot topics and related sub-topics. It is meaningful to the technology of information retrieval and mining.
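
Two of the steps named in the abstract, feature weighting and co-occurrence words, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the improved TFIDF, so the textbook TF-IDF formula is substituted here, and the latent semantic analysis and clustering steps are left out. The function names `tfidf` and `cooccurrence` and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf(docs):
    """Textbook TF-IDF (an assumption; the paper's improved variant is not given).

    docs: list of token lists. Returns one {term: weight} dict per document.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def cooccurrence(docs):
    """Count how many documents contain each unordered pair of distinct terms."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


# Hypothetical toy corpus of already-tokenized news texts.
docs = [
    ["election", "vote", "candidate"],
    ["election", "vote", "poll"],
    ["match", "goal", "team"],
]
weights = tfidf(docs)
pairs = cooccurrence(docs)
```

In this sketch, terms that appear in fewer documents receive higher weights, and word pairs that recur across the texts of one topic (here `election` and `vote`) are the kind of co-occurrence words that could be used to label a sub-topic.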

References (19)
Nicholas E. Evangelopoulos. Latent semantic analysis. Wiley Interdisciplinary Reviews: Cognitive Science, vol. 4, pp. 683-692 (2013). DOI: 10.1002/WCS.1254
Bruce A. Austin. Portrait of an art film audience. Journal of Communication, vol. 34, pp. 74-87 (1983).
Luca Maria Aiello, Georgios Petkos, Carlos Martin, David Corney, Symeon Papadopoulos, Ryan Skraba, Ayse Goker, Ioannis Kompatsiaris, Alejandro Jaimes. Sensing Trending Topics in Twitter. IEEE Transactions on Multimedia, vol. 15, pp. 1268-1282 (2013). DOI: 10.1109/TMM.2013.2265080
Lina Zhou, Dongsong Zhang. NLPIR: a theoretical framework for applying natural language processing to information retrieval. Journal of the Association for Information Science and Technology, vol. 54, pp. 115-123 (2003). DOI: 10.1002/ASI.10193
Dongwen Zhang, Hua Xu, Zengcai Su, Yunfeng Xu. Chinese comments sentiment classification based on word2vec and SVMperf. Expert Systems With Applications, vol. 42, pp. 1857-1863 (2015). DOI: 10.1016/J.ESWA.2014.09.011
Fangzhao Wu, Yongfeng Huang, Yangqiu Song. Structured microblog sentiment classification via social context regularization. Neurocomputing, vol. 175, pp. 599-609 (2016). DOI: 10.1016/J.NEUCOM.2015.10.101
Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman. Indexing by Latent Semantic Analysis. Journal of the Association for Information Science and Technology, vol. 41, pp. 391-407 (1990). DOI: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
Xiao-Ming Zhang, Zhou-Jun Li, Wen-Han Chao. Research of Automatic Topic Detection Based on Incremental Clustering. Journal of Software, vol. 23, pp. 1578-1587 (2012). DOI: 10.3724/SP.J.1001.2012.04111
Yanghui Rao. Contextual Sentiment Topic Model for Adaptive Social Emotion Classification. IEEE Intelligent Systems, vol. 31, pp. 41-47 (2016). DOI: 10.1109/MIS.2015.91
Vasilii A. Gromov, Anton S. Konev. Precocious identification of popular topics on Twitter with the employment of predictive clustering. Neural Computing and Applications, vol. 28, pp. 3317-3322 (2017). DOI: 10.1007/S00521-016-2256-1