A Critical Review of Migrating Parallel Web Crawler

Authors: Md. Faizan Farooqui, Md. Rizwan Beg, Md. Qasim Rafiq

DOI: 10.1007/978-3-642-31552-7_63

Abstract: The internet is very large and has grown enormously, and search engines are the tools for navigating the World Wide Web. In order to provide powerful search facilities, search engines maintain comprehensive indices of documents and their contents on the Web by continuously downloading Web pages for processing, a task known as web crawling. In this paper we review various web crawlers and their performance attributes. We study the mobile parallel crawling approach, which makes the web crawling system more effective and efficient. The major advantage of the migrating parallel crawler is that the analysis portion of the crawling process is done locally, where the data resides, rather than remotely inside the Web search engine. This can significantly reduce network load which, in turn, can improve the performance of the crawling process. As the size of the Web grows, it becomes imperative to parallelize the crawling process in order to finish downloading pages in a reasonable amount of time. We identify fundamental issues related to migrating parallel crawling. We also propose metrics to evaluate a migrating parallel crawler. Lastly, we summarize the performance attributes and their effects.
