Authors: Hai He, Weiyi Meng, Yiyao Lu, Clement Yu, Zonghuan Wu
DOI: 10.1007/S11280-006-0010-9
Keywords:
Abstract: Many databases have become Web-accessible through form-based search interfaces (i.e., HTML forms) that allow users to specify complex and precise queries to access the underlying databases. In general, such a Web search interface can be considered as containing an interface schema with multiple attributes and rich semantic/meta-information; however, the schema is not formally defined in HTML. Many Web applications, such as Web database integration and deep Web crawling, require the construction of these schemas. In this paper, we first propose a schema model for representing search interfaces, and then present a layout-expression-based approach to automatically extract the logical attributes from search interfaces. We also rephrase the identification of different types of semantic information as a classification problem, and design several Bayesian classifiers to help derive semantic information from the extracted attributes. A system, WISE-iExtractor, has been implemented to automatically construct the schema from any Web search interface. Our experimental results on real search interfaces indicate that the system is highly effective.