Indexing Rich Internet Applications Using Components-Based Crawling

Authors: Ali Moosavi, Salman Hooshmand, Sara Baghbanzadeh, Guy-Vincent Jourdan, Gregor V. Bochmann

Keywords: Web crawler, Component (UML), Ajax, JavaScript, Finite-state machine, Rich Internet application, Computer science, Search engine indexing, Crawling, Real-time computing

Abstract: Automatic crawling of Rich Internet Applications (RIAs) is a challenge because client-side code modifies the client dynamically, fetching server-side data asynchronously. Most existing solutions model RIAs as state machines, with DOMs as states and JavaScript event executions as transitions. This approach fails when used with “real-life”, complex RIAs, because the size of the produced model is much too large to be practical. In this paper, we propose a new method to crawl AJAX-based RIAs in an efficient manner by detecting “components”, which are areas of the DOM that are independent from each other, and by crawling each component separately. This leads to a dramatic reduction of the required state space for the model, without loss of content coverage. Our method does not require prior knowledge of the RIA nor a predefined definition of components. Instead, we infer the components by observing the behavior of the RIA during crawling. Our experimental results show that our method can quickly and completely index industrial RIAs that are simply out of reach for traditional methods.
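The state-space reduction described in the abstract can be illustrated with a small counting sketch. It is not the authors' algorithm: it only assumes a hypothetical page whose widgets (`menu`, `tabs`, `cart`, all invented names) are fully independent, and compares the number of whole-DOM states a traditional crawler would have to model with the number of per-component states.

```python
from math import prod

def whole_dom_state_count(components):
    """States to model when every combination of component states
    counts as a distinct DOM state (the traditional state-machine view)."""
    return prod(len(states) for states in components.values())

def per_component_state_count(components):
    """States to model when each independent component is crawled separately."""
    return sum(len(states) for states in components.values())

# Hypothetical page with three independent widgets (assumed for illustration).
widgets = {
    "menu": ["collapsed", "expanded"],
    "tabs": ["tab1", "tab2", "tab3"],
    "cart": ["empty", "filled"],
}

print(whole_dom_state_count(widgets))      # 2 * 3 * 2 = 12
print(per_component_state_count(widgets))  # 2 + 3 + 2 = 7
```

Under this independence assumption the whole-DOM count grows as a product of the component sizes while the per-component count grows as a sum, which is the kind of gap the abstract refers to.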
