Author: Jerome Robinson
DOI:
Keywords:
Abstract: Much useful e-commerce information is available on web pages, especially those created by queries to servers. The problem for programs that use that information is how to ‘screen-scrape’ the data off the page into machine-usable structures. Wrappers for web data sources use knowledge of page layout in order to extract data accurately, so they fail if the page format changes. This paper describes a fast method for wrapper production and also a way to automatically detect format change before it causes data access to fail. The method works for pages that contain collections of items, such as lists, tables and hierarchical structures. It uses a representation of HTML documents which makes repetitive features apparent. It provides fully automatic wrapper production for one class of pages and rapid interactive production for others.
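
To make the two ideas in the abstract concrete (a document representation in which repetition is visible, and early detection of a format change), the following is a minimal sketch in Python. It is not the paper's method, which the abstract does not specify. It assumes a simple stand-in representation: the root-to-element tag path of every start tag. The names `TagPathCollector`, `tag_paths`, `repeated_paths`, `layout_signature` and `format_changed` are hypothetical and introduced only for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Void elements never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagPathCollector(HTMLParser):
    """Records the root-to-element tag path of every start tag (an assumed
    representation, chosen because repeated items share the same path)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.paths = []   # one path string per start tag, in document order

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def tag_paths(html):
    """Return the list of tag paths for an HTML string."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def repeated_paths(html, min_count=2):
    """Paths that occur at least min_count times: candidates for item collections."""
    counts = Counter(tag_paths(html))
    return {path: n for path, n in counts.items() if n >= min_count}


def layout_signature(html):
    """The set of distinct tag paths; it ignores how many items the page lists."""
    return frozenset(tag_paths(html))


def format_changed(reference_html, current_html):
    """True if the current page uses a different set of tag paths than the reference."""
    return layout_signature(reference_html) != layout_signature(current_html)
```

Under this assumption, a results page that simply lists more rows keeps the same signature, while a page that switches from a table to a list does not, so a wrapper could check `format_changed` before attempting extraction. A real wrapper would need a more tolerant comparison, since pages often vary in ways that do not affect the data layout.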