Authors: Ying Wang, Huilai Li, Wanli Zuo, Fengling He, Xin Wang
Keywords:
Abstract: Ontology plays an important role in locating Domain-Specific Deep Web contents. This paper therefore presents a novel framework, WFF, for efficiently locating Domain-Specific Web databases based on focused crawling and ontology, by constructing a Web Page Classifier (WPC), a Form Structure Classifier (FSC), and a Form Content Classifier (FCC) in a hierarchical fashion. First, WPC discovers potentially interesting pages using an ontology-assisted focused crawler. Then, FSC analyzes these pages and determines whether they contain searchable forms, based on structural characteristics. Lastly, FCC identifies the searchable forms that belong to a given domain at the semantic level and stores the URLs of the Domain-Specific searchable forms in a database. Detailed experimental evaluation shows that WFF not only simplifies the discovery process but also effectively locates Domain-Specific databases.
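The hierarchical arrangement described in the abstract can be pictured as a cascade in which each classifier only sees what the previous one accepted. The sketch below is illustrative only and is not the authors' implementation: the `Page` type, the three predicate functions, the `min_hits` threshold, and the term-overlap heuristics are all hypothetical stand-ins for the real WPC, FSC, and FCC, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Page:
    url: str
    text: str
    # Each form is a dict: {"inputs": [input types], "labels": [field labels]}
    forms: List[dict] = field(default_factory=list)


def wpc_is_on_topic(page: Page, ontology_terms: Set[str], min_hits: int = 2) -> bool:
    """Stand-in for WPC (assumption): page is on topic if enough ontology terms occur."""
    words = set(page.text.lower().split())
    return len(words & ontology_terms) >= min_hits


def fsc_is_searchable(form: dict) -> bool:
    """Stand-in for FSC (assumption): purely structural test, no semantics."""
    inputs = form.get("inputs", [])
    return "text" in inputs and "password" not in inputs


def fcc_in_domain(form: dict, ontology_terms: Set[str]) -> bool:
    """Stand-in for FCC (assumption): form labels must overlap the domain ontology."""
    labels = {label.lower() for label in form.get("labels", [])}
    return bool(labels & ontology_terms)


def cascade(pages: List[Page], ontology_terms: Set[str]) -> List[str]:
    """Apply WPC, then FSC, then FCC; collect URLs of pages with an accepted form."""
    found = []
    for page in pages:
        if not wpc_is_on_topic(page, ontology_terms):
            continue  # rejected at the page level; forms are never inspected
        for form in page.forms:
            if fsc_is_searchable(form) and fcc_in_domain(form, ontology_terms):
                found.append(page.url)
                break  # one accepted form is enough to keep the URL
    return found
```

Ordering the cheap page-level filter first means the form-level checks run only on pages that already look relevant, which is one plausible reading of how the hierarchy simplifies the discovery process.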