The Sindice-2011 Dataset for Entity-Oriented Search in the Web of Data

Authors: Giovanni Tummarello, Renaud Delbru, Stéphane Campinas, Krisztian Balog, Diego Ceccarelli

DOI:

Keywords:

Abstract: The task of entity retrieval becomes increasingly prevalent as more and more (semi-)structured information about objects is available on the Web in the form of documents embedding metadata (RDF, RDFa, Microformats, and others). However, research and development in that direction is dependent on (1) the availability of a representative corpus of entities that are found on the Web, and (2) the availability of an entity-oriented search infrastructure for experimenting with new retrieval models. In this paper, we introduce the Sindice-2011 data collection, which is derived from data collected by the Sindice semantic search engine. The data collection (available at http://data.sindice.com/trec2011/) is especially designed for supporting research in the domain of web entity retrieval. We describe how the corpus is organised, discuss statistics of the data collection, and introduce a search infrastructure to foster research and development.
