Geocoding textual documents through the usage of hierarchical classifiers

Authors: Fernando Melo, Bruno Martins

DOI: 10.1145/2837689.2837690

Keywords: Support vector machine, Geocoding, Supervised learning, Computer science, Representation (mathematics), Geospatial analysis, Information retrieval

Abstract: In this paper, we evaluate automated techniques, based on a hierarchical representation for the Earth's surface and leveraging SVM classifiers, for assigning geospatial coordinates to previously unseen documents, using only raw text as input evidence. We report experiments with Wikipedia documents in four different languages and with two Twitter datasets from previous studies. We obtained state-of-the-art results, showing that document geocoding can be handled effectively with appropriate bag-of-words representations and out-of-the-box supervised learning methods.
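The general idea in the abstract (discretize the Earth's surface into a hierarchy of cells, then classify a document top-down from coarse to fine cells using bag-of-words features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the abstract does not say which discretization is used, so a plain latitude/longitude quadtree stands in for it; a nearest-centroid scorer stands in for the SVM classifiers so that the example needs only the standard library; and all names (`cell_path`, `CentroidNode`, `train`, `geocode`) are hypothetical.

```python
from collections import Counter, defaultdict
import math


def cell_path(lat, lon, depth):
    """Quadtree cell ids for a coordinate, from coarsest to finest (assumed grid)."""
    south, north, west, east = -90.0, 90.0, -180.0, 180.0
    path, cell = [], ""
    for _ in range(depth):
        mid_lat, mid_lon = (south + north) / 2, (west + east) / 2
        upper, right = lat >= mid_lat, lon >= mid_lon
        south, north = (mid_lat, north) if upper else (south, mid_lat)
        west, east = (mid_lon, east) if right else (west, mid_lon)
        cell += str(2 * upper + right)  # quadrant digit in 0..3
        path.append(cell)
    return path


def cell_center(cell):
    """Centre (lat, lon) of a cell id produced by cell_path."""
    south, north, west, east = -90.0, 90.0, -180.0, 180.0
    for digit in cell:
        q = int(digit)
        mid_lat, mid_lon = (south + north) / 2, (west + east) / 2
        south, north = (mid_lat, north) if q >= 2 else (south, mid_lat)
        west, east = (mid_lon, east) if q % 2 else (west, mid_lon)
    return ((south + north) / 2, (west + east) / 2)


def bag_of_words(text):
    """Raw term counts; a real system would likely use weighted features."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class CentroidNode:
    """Per-node classifier over child cells (stand-in for an SVM)."""

    def __init__(self):
        self.centroids = defaultdict(Counter)

    def fit_one(self, bow, child):
        self.centroids[child].update(bow)

    def predict(self, bow):
        # sorted() makes tie-breaking deterministic
        return max(sorted(self.centroids), key=lambda c: cosine(bow, self.centroids[c]))


def train(docs, depth=3):
    """docs: iterable of (text, lat, lon). One classifier per parent cell."""
    nodes = defaultdict(CentroidNode)
    for text, lat, lon in docs:
        bow, parent = bag_of_words(text), ""
        for cell in cell_path(lat, lon, depth):
            nodes[parent].fit_one(bow, cell)
            parent = cell
    return nodes


def geocode(nodes, text, depth=3):
    """Descend the hierarchy, picking the best child cell at each level."""
    bow, cell = bag_of_words(text), ""
    for _ in range(depth):
        if cell not in nodes:
            break
        cell = nodes[cell].predict(bow)
    return cell_center(cell)
```

A toy usage, with made-up training documents: calling `train([("lisbon tagus river portugal", 38.72, -9.14), ("tokyo shibuya japan", 35.68, 139.69)])` and then `geocode(nodes, "visiting lisbon portugal")` returns the centre of the finest cell that contained the Lisbon training document. Deciding among only a handful of child cells at each level keeps every individual classification problem small, which is the usual motivation for a hierarchical scheme over a single flat classifier with one class per fine-grained cell.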
