Authors: Milad Shokouhi, Luo Si
DOI:
Keywords: Federated search, World Wide Web, Search engine, Parallel search, Index (publishing), Information retrieval, Deep Web, Collection selection, Crawling, Computer science
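Abstract: Federated search (federated information retrieval or distributed information retrieval) is a technique for searching multiple text collections simultaneously. Queries are submitted to a subset of collections that are most likely to return relevant answers. The results returned by the selected collections are integrated and merged into a single list. Federated search is preferred over centralized search alternatives in many environments. For example, commercial search engines such as Google cannot easily index uncrawlable hidden web collections, while federated search systems can search the contents of hidden web collections without crawling. In enterprise environments, where each organization maintains an independent search engine, federated search techniques can provide parallel search over multiple collections. There are three major challenges in federated search. For each query, a subset of collections that are most likely to return relevant documents is selected. This creates the collection selection problem. To be able to select suitable collections, federated search systems need to acquire some knowledge about the contents of each collection, creating the collection representation problem. The results returned from the selected collections are merged before the final presentation to the user. This final step is the result merging problem. The goal of this work is to provide a comprehensive summary of the previous research on the federated search challenges described above.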