Authors: Simon Jonassen, Svein Erik Bratsberg
DOI: 10.1007/978-3-642-35063-4_1
Keywords: Computer science, Scalability, Query optimization, Search engine, Parallel computing, Query expansion, Inverted index, Theoretical computer science
Abstract: Web search engines need to provide high throughput and short query latency. Recent results show that pipelined query processing over a term-wise partitioned inverted index may have superior throughput. However, the query processing latency and scalability with respect to the collection size are the main challenges associated with this method. In this paper, we evaluate the effect of inverted index skipping on the performance of pipelined query processing. Further, we introduce a novel idea of using Max-Score pruning within pipelined query processing and a new term assignment heuristic, partitioning by Max-Score. Our current results indicate a significant improvement over the state-of-the-art approach and lead to several further optimizations, which include dynamic load balancing, intra-query concurrent processing, and a hybrid combination between pipelined and non-pipelined execution.