Extracting evolution of web communities from a series of web archives

Authors: Masashi Toyoda, Masaru Kitsuregawa

DOI: 10.1145/900051.900059

Abstract: Recent advances in storage technology make it possible to store a series of large Web archives. It is now an exciting challenge for us to observe the evolution of the Web. In this paper, we propose a method for observing the evolution of web communities. A web community is a set of web pages created by individuals or associations with a common interest on a topic. So far, various link analysis techniques have been developed to extract web communities. We analyze the evolution of web communities by comparing four Japanese web archives crawled from 1999 to 2002. Statistics of these archives and of community evolution are examined, and the global behavior of evolution is described. Several metrics are introduced to measure the degree of web community evolution, such as growth rate, novelty, and stability. We developed a system for extracting the detailed evolution of communities using these metrics. It allows us to understand when and how communities emerged and evolved. Some examples of evolution are shown using our system.

References (16)
Hector Garcia-Molina, Junghoo Cho. The Evolution of the Web and Implications for an Incremental Crawler. Very Large Data Bases, pp. 200-209 (2000).
Krishna Bharat, Andrei Broder, Monika Henzinger, Puneet Kumar, Suresh Venkatasubramanian. The connectivity server: fast access to linkage information on the Web. The Web Conference, vol. 30, pp. 469-477 (1998). DOI: 10.1016/S0169-7552(98)80047-0
Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. Symposium on Discrete Algorithms, pp. 668-677 (1998). DOI: 10.5555/314613.315045
Gary William Flake, Steve Lawrence, C. Lee Giles. Efficient identification of Web communities. Knowledge Discovery and Data Mining, pp. 150-160 (2000). DOI: 10.1145/347090.347121
Jeffrey Dean, Monika R. Henzinger. Finding related pages in the World Wide Web. The Web Conference, vol. 31, pp. 1467-1479 (1999). DOI: 10.1016/S1389-1286(99)00022-5
Soumen Chakrabarti, Byron Dom, Prabhakar Raghavan, Sridhar Rajagopalan, David Gibson, Jon Kleinberg. Automatic resource compilation by analyzing hyperlink structure and associated text. The Web Conference, vol. 30, pp. 65-74 (1998). DOI: 10.1016/S0169-7552(98)00087-7
David Gibson, Jon Kleinberg, Prabhakar Raghavan. Inferring Web communities from link topology. ACM Conference on Hypertext, pp. 225-234 (1998). DOI: 10.1145/276627.276652
Brian E. Brewington, George Cybenko. How dynamic is the Web. The Web Conference, vol. 33, pp. 257-276 (2000). DOI: 10.1016/S1389-1286(00)00045-1
Krishna Bharat, Monika R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 51, pp. 104-111 (1998). DOI: 10.1145/3130348.3130367
Masashi Toyoda, Masaru Kitsuregawa. Creating a Web community chart for navigating related communities. ACM Conference on Hypertext, pp. 103-112 (2001). DOI: 10.1145/504216.504244