Characteristics of Open Data CSV Files

Authors: Johann Mitlohner, Sebastian Neumaier, Jurgen Umbrich, Axel Polleres

DOI: 10.1109/OBD.2016.18

Abstract: … The Python standard CSV library also uses a single whitespace character as a possible delimiter. We excluded this choice since it is an extremely uncommon delimiter with a high rate …
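
The delimiter restriction mentioned in the abstract can be illustrated with the `Sniffer` class from Python's standard `csv` module, which accepts an explicit set of candidate delimiters. The following is only a minimal sketch, not the authors' code: the candidate set `CANDIDATE_DELIMITERS` and the helper `detect_delimiter` are hypothetical names chosen for illustration, and the paper's actual candidate list is not given in this excerpt.

```python
import csv
from typing import Optional

# Assumption: an illustrative candidate set that deliberately omits the
# space character. The delimiters actually used in the paper may differ.
CANDIDATE_DELIMITERS = ",;\t|"


def detect_delimiter(sample: str,
                     candidates: str = CANDIDATE_DELIMITERS) -> Optional[str]:
    """Guess the delimiter of a CSV sample, considering only `candidates`.

    Returns None when the sniffer cannot settle on any candidate.
    """
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # Raised when no delimiter could be determined from the sample.
        return None
    return dialect.delimiter


if __name__ == "__main__":
    sample = "name;city;population\nVienna;Wien;1800000\nGraz;Graz;280000\n"
    print(repr(detect_delimiter(sample)))
```

Passing `delimiters` keeps the sniffer from proposing a space even when cell values contain spaces at a regular rate, which appears to be the kind of misdetection the abstract refers to.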

References (9)
Jingjing Wang, Haixun Wang, Zhongyuan Wang, Kenny Q. Zhu. Understanding tables on the web. International Conference on Conceptual Modeling, pp. 141-155 (2012). DOI: 10.1007/978-3-642-34002-4_11
Jurgen Umbrich, Sebastian Neumaier, Axel Polleres. Quality Assessment and Evolution of Open Data Portals. Conference on the Future of the Internet, pp. 404-411 (2015). DOI: 10.1109/FICLOUD.2015.82
Dominique Ritze, Oliver Lehmberg, Christian Bizer. Matching HTML Tables to DBpedia. Web Intelligence, Mining and Semantics, pp. 10- (2015). DOI: 10.1145/2797115.2797118
Ivan Ermilov, Sören Auer, Claus Stadler. User-driven semantic mapping of tabular data. Proceedings of the 9th International Conference on Semantic Systems - I-SEMANTICS '13, pp. 105-112 (2013). DOI: 10.1145/2506182.2506196
George A. Miller. WordNet. Communications of the ACM, vol. 38, pp. 39-41 (1995). DOI: 10.1145/219717.219748
Eric Crestan, Patrick Pantel. Web-scale table census and classification. Web Search and Data Mining, pp. 545-554 (2011). DOI: 10.1145/1935826.1935904
Yakov Shafranovich. Common Format and MIME Type for Comma-Separated Values (CSV) Files. RFC, vol. 4180, pp. 1-8 (2005).
Dominique Ritze, Oliver Lehmberg, Yaser Oulabi, Christian Bizer. Profiling the Potential of Web Tables for Augmenting Cross-domain Knowledge Bases. The Web Conference, pp. 251-261 (2016). DOI: 10.1145/2872427.2883017
Mariano Rodriguez-Muro, Oktie Hassanzadeh, Kavitha Srinivas, Michael Jeffrey Ward. Understanding a large corpus of web tables through matching with knowledge bases: an empirical study. OM, pp. 25-34 (2015).