A linguistic approach for determining the topics of Spanish Twitter messages

Authors: David Vilares, Miguel A. Alonso, Carlos Gómez-Rodríguez

DOI: 10.1177/0165551514561652

Abstract: The vast number of opinions and reviews provided in Twitter is helpful in order to make interesting findings about a given industry, but given the huge number of messages published every day, it is important to detect the relevant ones. In this respect, the Twitter search functionality is not a practical tool when we want to poll messages dealing with a given set of general topics. This article presents an approach to classify Twitter messages into various topics. We tackle the problem from a linguistic angle, taking into account part-of-speech, syntactic and semantic information, showing how language processing techniques should be adapted to deal with the informal language present in Twitter messages. The TASS 2013 General corpus, a collection of tweets that has been specifically annotated to perform text analytics tasks, is used as the dataset in our evaluation framework. We carry out a wide range of experiments to determine which kinds of linguistic information have the greatest impact on this task and how they should be combined in order to obtain the best-performing system. The results lead us to conclude that relating features by means of contextual information adds complementary knowledge over pure lexical models, making it possible to outperform them on standard metrics for multilabel classification tasks.
