Language Independent Ranked Retrieval with NeWT

Authors: M. Yasukawa, F. Scholer, J. Culpepper

Abstract: In this paper, we present a novel approach to language-independent, ranked document retrieval using our new self-index search engine, Newt. To our knowledge, this is the first experimental study of self-indexing for multilingual Information Retrieval tasks. We evaluate the query effectiveness of our indexes on Japanese and English. We explore the impact that linguistic processing, such as stemming and stopping, has on character-aligned indexes, and discuss the advantages and challenges discovered during our initial evaluation.
