A natural language processing and geospatial clustering framework for harvesting local place names from geotagged housing advertisements

Authors: Yingjie Hu, Huina Mao, Grant McKenzie

DOI: 10.1080/13658816.2018.1458986

Abstract: Local place names are frequently used by residents living in a geographic region. Such place names may not be recorded in existing gazetteers, due to their vernacular nature, relative insignificance to a gazetteer covering a large area (e.g., the entire world), recent establishment (e.g., the name of a newly-opened shopping center), or other reasons. While not always recorded, local place names play important roles in many applications, from supporting public participation in urban planning to locating victims in disaster response. In this paper, we propose a computational framework for harvesting local place names from geotagged housing advertisements. We make use of those advertisements posted on local-oriented websites, such as Craigslist, where local place names are often mentioned. The proposed framework consists of two stages: natural language processing (NLP) and geospatial clustering. The NLP stage examines the textual content of the advertisements and extracts place name candidates. The geospatial stage focuses on the coordinates associated with the extracted place name candidates and performs multi-scale geospatial clustering to filter out non-place names. We evaluate our framework by comparing its performance with that of six baselines. We also compare our result with four existing gazetteers to demonstrate the not-yet-recorded local place names discovered by our framework.
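
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy advertisements, the capitalized-phrase regular expression standing in for the NLP stage, and the medoid-radius test at several distance scales standing in for multi-scale geospatial clustering. The function names (`extract_candidates`, `is_spatially_concentrated`, `harvest`) and all thresholds are hypothetical.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy ads as (text, latitude, longitude); not data from the paper.
ADS = [
    ("Sunny 2BR near Maple Commons, Great Location", 40.0010, -83.0010),
    ("Studio in Maple Commons, walk to shops", 40.0012, -83.0008),
    ("Maple Commons townhouse, Great Location", 40.0009, -83.0012),
    ("Spacious home, Great Location", 40.2000, -83.3000),
    ("Great Location downtown loft", 39.8000, -82.7000),
]

# Stage 1 stand-in: treat runs of two or more capitalized words as candidates.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_candidates(text):
    """Return the set of candidate phrases found in one advertisement."""
    return set(CANDIDATE.findall(text))


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def is_spatially_concentrated(points, radii_km=(0.5, 2.0, 8.0), min_share=0.8):
    """Stage 2 stand-in: at any of several scales, do most points fall
    within the radius of the medoid (the most central observed point)?"""
    medoid = min(points, key=lambda c: sum(haversine_km(c, p) for p in points))
    for radius in radii_km:
        inside = sum(haversine_km(medoid, p) <= radius for p in points)
        if inside / len(points) >= min_share:
            return True
    return False


def harvest(ads, min_count=3):
    """Keep candidates mentioned often enough and concentrated in space."""
    locations = defaultdict(list)
    for text, lat, lon in ads:
        for name in extract_candidates(text):
            locations[name].append((lat, lon))
    return sorted(
        name for name, pts in locations.items()
        if len(pts) >= min_count and is_spatially_concentrated(pts)
    )


print(harvest(ADS))
```

In this toy data, "Maple Commons" is mentioned at three nearby coordinates and is kept, while the generic phrase "Great Location" appears at widely scattered coordinates and is filtered out. This mirrors the intuition that place names cluster spatially and ordinary phrases do not.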

References (46)
Matthew T. Rice, Ahmad O. Aburizaiza, R. Daniel Jacobson, Brandon M. Shore, Fabiana I. Paez. Supporting Accessibility for Blind and Vision‐impaired People With a Localized Gazetteer and Open Source Geotechnology. Transactions in GIS, vol. 16, pp. 177-190 (2012). DOI: 10.1111/J.1467-9671.2012.01318.X
S. J. Sheather, M. C. Jones. A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society, Series B (Methodological), vol. 53, pp. 683-690 (1991). DOI: 10.1111/J.2517-6161.1991.TB01857.X
Brent Hecht, Martin Raubal. GeoSR: Geographically explore semantic relations in world knowledge. Geographic Information Science, pp. 95-113 (2008). DOI: 10.1007/978-3-540-78946-8_6
Susana Ladra, Miguel R. Luaces, Oscar Pedreira, Diego Seco. A Toponym Resolution Service Following the OGC WPS Standard. Web and Wireless Geographical Information Systems, pp. 75-85 (2008). DOI: 10.1007/978-3-540-89903-7_8
Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, Christian Bizer. DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web, vol. 6, pp. 167-195 (2015). DOI: 10.3233/SW-140134
Florian A. Twaroch, Christopher B. Jones, Alia I. Abdelmoty. Acquisition of Vernacular Place Names from Web Sources. Weaving Services and People on the World Wide Web, pp. 195-214 (2009). DOI: 10.1007/978-3-642-00570-1_10
Grant McKenzie, Krzysztof Janowicz, Song Gao, Jiue-An Yang, Yingjie Hu. POI Pulse: A Multi-granular, Semantic Signature–Based Information Observatory for the Interactive Visualization of Big Geosocial Data. Cartographica: The International Journal for Geographic Information and Geovisualization, vol. 50, pp. 71-85 (2015). DOI: 10.3138/CART.50.2.2662
Florian A. Twaroch, Christopher B. Jones. A web platform for the evaluation of vernacular place names in automatically constructed gazetteers. Geographic Information Retrieval, pp. 14- (2010). DOI: 10.1145/1722080.1722098
Christopher B. Jones, Ross S. Purves. Geographical information retrieval. International Journal of Geographical Information Science, vol. 22, pp. 219-228 (2008). DOI: 10.1080/13658810701626343