A combined component approach for finding collection-adapted ranking functions based on genetic programming

Authors: Humberto Mossri de Almeida, Marcos André Gonçalves, Marco Cristo, Pável Calado

DOI: 10.1145/1277741.1277810

Abstract: In this paper, we propose a new method to discover collection-adapted ranking functions based on Genetic Programming (GP). Our Combined Component Approach (CCA) is based on the combination of several term-weighting components (i.e., term frequency, collection frequency, normalization) extracted from well-known ranking functions. In contrast to related work, the GP terminals in our CCA are not based on simple statistical information of a document collection, but on meaningful, effective, and proven components. Experimental results show that our approach was able to outperform standard TF-IDF, BM25, and another GP-based approach on two different collections. CCA obtained improvements in mean average precision of up to 40.87% for the TREC-8 collection and 24.85% for the WBR99 collection (a large Brazilian Web collection) over the baseline functions. The CCA evolution process was also able to reduce overtraining, commonly found in machine learning methods, especially genetic programming, and to converge faster than the other GP-based approach used for comparison.
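The idea of using whole term-weighting components as GP terminals can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not list the actual terminal set, the function set, the parameter values, or the fitness function, so the names `TermStats`, `TERMINALS`, `FUNCTIONS`, `evaluate`, and `average_precision`, the constants inside the components, and the choice of average precision as a fitness signal are hypothetical.

```python
import math
import operator
from dataclasses import dataclass


@dataclass
class TermStats:
    """Statistics for one (term, document) pair; field names are illustrative."""
    tf: float      # term frequency in the document
    df: float      # number of documents containing the term
    dl: float      # document length
    avgdl: float   # average document length in the collection
    n_docs: float  # number of documents in the collection


# Hypothetical terminals: each one is a complete component borrowed from a
# familiar weighting scheme, not a raw statistic. Constants are assumed values.
TERMINALS = {
    "tf_log": lambda s: 1.0 + math.log(s.tf) if s.tf > 0 else 0.0,
    "tf_okapi": lambda s: s.tf * 2.2 / (s.tf + 1.2 * (0.25 + 0.75 * s.dl / s.avgdl)),
    "idf": lambda s: math.log(s.n_docs / s.df),
    "pivot_norm": lambda s: 1.0 / (0.8 + 0.2 * s.dl / s.avgdl),
}

# Hypothetical function set used at the inner nodes of a GP individual.
FUNCTIONS = {"add": operator.add, "mul": operator.mul}


def evaluate(tree, stats):
    """Evaluate an expression tree: a terminal name or an (op, left, right) tuple."""
    if isinstance(tree, str):
        return TERMINALS[tree](stats)
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, stats), evaluate(right, stats))


def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list; averaging over queries gives MAP."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


# One candidate individual: an Okapi-style tf component times a plain idf.
candidate = ("mul", "tf_okapi", "idf")
stats = TermStats(tf=3, df=10, dl=100, avgdl=100, n_docs=1000)
weight = evaluate(candidate, stats)
```

In a full GP loop, a population of such trees would be scored on training queries, and the better individuals would be recombined and mutated; that loop is omitted here because the abstract gives no details about it.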
