Authors: Zhaomeng Peng, Nengqiang He, Chunxiao Jiang, Zhihua Li, Lei Xu
Keywords:
Abstract: AJAX (Asynchronous JavaScript and XML) is becoming more popular with the prosperity of Web 2.0. However, traditional crawlers fail to retrieve information from AJAX applications because of their complex operations. Moreover, in a single AJAX application, one URL may have different page states, which violates the rule that one URL corresponds to one unique page. The AJAX application can be modeled as a state transition graph, and the crawl is to traverse the graph without prior knowledge of its structure. In this paper, we distinguish the events that are not well defined in previous work and propose a Graph-based AJAX State Traversal (GAST) algorithm to crawl the AJAX application with minimal edge visits. If the topology of the graph is given, the optimization problem turns into a Directed Rural Postman Problem (DRPP), and the optimal lower bound can be obtained. Experimental results show that GAST approaches the optimum and exhibits better performance than existing work.