Hashtag sense clustering based on temporal similarity

Authors: Giovanni Stilo, Paola Velardi

DOI: 10.1162/COLI_A_00277

Abstract: Hashtags are creative labels used in micro-blogs to characterize the topic of a message/discussion. Regardless of the use for which they were originally intended, hashtags cannot be used as a means to cluster messages with similar content. First, because hashtags are created in a spontaneous and highly dynamic way by users in multiple languages, the same topic can be associated with different hashtags, and conversely, the same hashtag may refer to different topics in different time periods. Second, contrary to common words, hashtag disambiguation is complicated by the fact that no sense catalogs (e.g., Wikipedia or WordNet) are available; and, furthermore, hashtags are difficult to analyze, as they often consist of acronyms, concatenated words, and so forth. A common way to determine the meaning of hashtags has been to analyze their context, but, as we have just pointed out, hashtags can have multiple and variable meanings. In this article, we propose a temporal sense clustering algorithm based on the idea that semantically related hashtags have similar and synchronous usage patterns.
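The idea in the last sentence of the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' algorithm: it assumes each hashtag is represented by a hypothetical series of daily usage counts, treats two hashtags as related when the Pearson correlation of their series reaches an arbitrary threshold, and groups related hashtags by connected components. The function names, the threshold, and the sample data are all invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length count series (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cluster_by_usage(series, threshold=0.9):
    """Group hashtags whose usage series correlate at or above `threshold`.

    `series` maps each hashtag to a list of counts over the same time slots.
    Clusters are the connected components of the "correlated" relation,
    tracked with a small union-find structure.
    """
    parent = {tag: tag for tag in series}

    def find(tag):
        # Follow parent links to the component root, shortening the path as we go.
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]
            tag = parent[tag]
        return tag

    for a, b in combinations(series, 2):
        if pearson(series[a], series[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for tag in series:
        groups.setdefault(find(tag), set()).add(tag)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical daily counts over one week (invented data).
usage = {
    "#worldcup": [1, 2, 30, 40, 3, 1, 1],
    "#mundial": [0, 1, 25, 38, 2, 1, 0],
    "#oscars": [1, 1, 1, 2, 35, 40, 2],
}
clusters = cluster_by_usage(usage)
print(clusters)
```

In this toy data, the two hashtags that peak on the same days end up in one cluster even though they are in different languages, while the hashtag that peaks later stays separate. Plain correlation over raw counts is only a stand-in here for whatever temporal similarity measure the article actually defines.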
