Authors: Raymond P. Stata, Janet L. Wiener, Michael Burrows
DOI:
Keywords:
Abstract: A connectivity server for collecting, arranging, and representing data that defines the interconnection of pages on the World Wide Web (Web). A URL Database stores URLs and associates a fingerprint and a CS_id with each URL. The interface to the URL Database is operable to translate between any two of URL, fingerprint, and Host_id. A Host Database associates a Host_id with each distinct hostname in the URL Database. The interface to the Host Database is operable to accept a Host_id and return a number equal to the number of URLs on the respective host, together with the CS_ids of those URLs. A Link Database stores links between source URLs and destination URLs and is operable to retrieve, given a CS_id, the inlinks to and outlinks from the URL corresponding to that CS_id. In an embodiment characterized by a single processor with access to all databases stored in RAM, information may be retrieved sufficiently rapidly that applications that touch every link, even multiple times, may execute in real time, that is, within a few minutes or hours. Representative applications enabled by the connectivity server include static ranking (eigenranks), query precomputation, mirror-site detection, and related-page identification.
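The three databases described in the abstract can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions, not the implementation described by the authors: the class name `ConnectivityServer`, its methods, the use of a truncated SHA-256 digest as the fingerprint, and the sequential assignment of CS_ids and Host_ids are all hypothetical choices made for the example.

```python
import hashlib
from urllib.parse import urlsplit


class ConnectivityServer:
    """Toy in-memory model of the URL, Host, and Link databases (hypothetical)."""

    def __init__(self):
        self.urls = []          # URL Database: CS_id -> URL
        self.id_by_url = {}     # URL Database: URL -> CS_id
        self.id_by_fp = {}      # URL Database: fingerprint -> CS_id
        self.host_ids = {}      # Host Database: hostname -> Host_id
        self.ids_by_host = []   # Host Database: Host_id -> CS_ids on that host
        self.outlinks = []      # Link Database: CS_id -> destination CS_ids
        self.inlinks = []       # Link Database: CS_id -> source CS_ids

    @staticmethod
    def fingerprint(url):
        # Assumption: the first 8 bytes of SHA-256 stand in for the fingerprint.
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_url(self, url):
        """Register a URL (if new) and return its CS_id."""
        if url in self.id_by_url:
            return self.id_by_url[url]
        cs_id = len(self.urls)  # assumption: CS_ids are assigned sequentially
        self.urls.append(url)
        self.id_by_url[url] = cs_id
        self.id_by_fp[self.fingerprint(url)] = cs_id
        self.outlinks.append([])
        self.inlinks.append([])
        host = urlsplit(url).hostname or ""
        host_id = self.host_ids.setdefault(host, len(self.host_ids))
        if host_id == len(self.ids_by_host):  # first URL seen on this host
            self.ids_by_host.append([])
        self.ids_by_host[host_id].append(cs_id)
        return cs_id

    def add_link(self, source_url, destination_url):
        """Record a link from source_url to destination_url."""
        src = self.add_url(source_url)
        dst = self.add_url(destination_url)
        self.outlinks[src].append(dst)
        self.inlinks[dst].append(src)

    def host_info(self, host_id):
        """Return (number of URLs on the host, CS_ids of those URLs)."""
        cs_ids = self.ids_by_host[host_id]
        return len(cs_ids), list(cs_ids)


server = ConnectivityServer()
server.add_link("http://a.example/", "http://b.example/x")
server.add_link("http://a.example/p", "http://b.example/x")
target = server.id_by_url["http://b.example/x"]
print([server.urls[i] for i in server.inlinks[target]])
print(server.host_info(server.host_ids["a.example"]))
```

Because every structure here is a plain list or dictionary held in memory, a pass that touches every link is a simple loop over `outlinks`, which is the property the abstract points to when it mentions applications such as static ranking or mirror-site detection.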