Abstract: Topical crawling is a young and creative area of research that holds the promise of benefiting from several sophisticated data mining techniques. The use of classification algorithms to guide topical crawlers has been sporadically suggested in the literature. No systematic study, however, has been done on their relative merits. Using the lessons learned from our previous crawler evaluation studies, we experiment with multiple versions of different classification schemes. The crawling process is modeled as a parallel best-first search over a graph defined by the Web. The classifiers provide heuristics to the crawler, thus biasing it towards certain portions of the Web graph. Our results show that Naive Bayes is a weak choice for guiding a topical crawler when compared with Support Vector Machine or Neural Network. Further, the weak performance of Naive Bayes can be partly explained by the extreme skewness of the posterior probabilities generated by it. We also observe that, despite similar performances, different topical crawlers cover subspaces on the Web with low overlap.
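To make the crawling model concrete, the following is a minimal sketch of a classifier-guided best-first search over a link graph. It is an illustration under stated assumptions, not the implementation evaluated here: it is sequential rather than parallel, the Web is replaced by a small in-memory dictionary, and the classifier is replaced by a lookup table of made-up relevance scores. The names `best_first_crawl`, `TOY_GRAPH`, and `TOY_RELEVANCE` are hypothetical, as is the choice to give each unvisited link the score of the page on which it was found.

```python
import heapq
from typing import Callable, Dict, List

# Hypothetical toy link graph standing in for the Web: page -> outgoing links.
TOY_GRAPH: Dict[str, List[str]] = {
    "seed": ["a", "b"],
    "a": ["a1"],
    "b": ["b1"],
    "a1": [],
    "b1": [],
}

# Made-up relevance scores standing in for a classifier's output
# (for example, a posterior probability that the page is on topic).
TOY_RELEVANCE: Dict[str, float] = {
    "seed": 0.5, "a": 0.9, "b": 0.1, "a1": 0.8, "b1": 0.2,
}


def best_first_crawl(
    graph: Dict[str, List[str]],
    seeds: List[str],
    score: Callable[[str], float],
    max_pages: int,
) -> List[str]:
    """Visit up to max_pages pages, always expanding the best-scored link.

    Assumption: a link inherits the score of the page it was found on,
    and seed pages start with the maximum priority of 1.0.
    """
    frontier = []          # min-heap of (-priority, insertion order, url)
    seen = set(seeds)      # URLs already queued or visited
    order = 0              # tie-breaker so equal priorities pop first-in, first-out
    for url in seeds:
        heapq.heappush(frontier, (-1.0, order, url))
        order += 1

    visited: List[str] = []
    while frontier and len(visited) < max_pages:
        _, _, page = heapq.heappop(frontier)
        visited.append(page)
        parent_score = score(page)  # the classifier's judgment of the fetched page
        for link in graph.get(page, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-parent_score, order, link))
                order += 1
    return visited


if __name__ == "__main__":
    print(best_first_crawl(TOY_GRAPH, ["seed"], TOY_RELEVANCE.get, max_pages=4))
```

In this toy run the crawler follows the link found on the highly scored page `a` before it returns to `b`, which is the biasing effect that the classifier's heuristic is meant to produce. A parallel variant would pop several top-ranked links from the frontier at each step instead of one.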