Abstract: Traversal-based approaches to execute queries over data on the Web have recently been studied. These approaches make use of up-to-date data from initially unknown sources and, thus, enable applications to tap the full potential of the Web. While existing work focuses primarily on implementation techniques, a principled analysis of the subwebs that are reachable by such approaches is missing. Such an analysis may help to gain new insight into the problem of optimizing the response time of traversal-based query engines. Furthermore, a better understanding of the characteristics of such subwebs may also inform approaches to benchmark these engines. This paper provides such an analysis. In particular, we identify typical graph-based properties of query-specific reachable subwebs and quantify their diversity. Furthermore, we investigate whether vertex scoring methods (e.g., PageRank) are able to predict query-relevance when applied to such subwebs.