An Improved Latent Dirichlet Allocation Model for Hot Topic Extraction

Authors: Guolong Liu, Xiaofei Xu, Ying Zhu, Li Li

DOI: 10.1109/BDCLOUD.2014.55

Abstract: Micro blogging is fast becoming a dominant medium in social media, and its impact is evident in our daily lives. A massive amount of information is produced on a daily basis. It is observed that detecting hot topics can be very helpful for people to get essential information quickly. But due to the short and sparse features, the high flood of meaningless tweets, and other characteristics of micro blogs, traditional topic detection methods are unable to achieve a desirable level of performance. In this paper, we propose a multi-attribute latent Dirichlet allocation (MA-LDA) model, a topic analysis model in which the time and tag attributes of micro blogs are incorporated into the LDA model. By introducing a variable for the time attribute, MA-LDA can decide whether a word should appear in hot topics or not. Applying the tag attribute allows the core words to be ranked in the results, so the expressiveness of the outcomes is improved over [...]. Empirical evaluation on real data sets demonstrates that the method is able to detect hot topics accurately and efficiently, with more terms associated with each topic found. Our study provides strong evidence of the importance of the temporal factor in hot topic extraction.
