Author: Mahesh Tiyyagura
DOI:
Keywords:
Abstract: A method is provided for information extraction from among a multiplicity of documents, each having a corresponding document object model (DOM), comprising: computing signatures associated with nodes of the DOMs corresponding to the documents; producing an index that associates each computed signature with the documents whose DOM has one or more nodes associated with that signature; annotating nodes of the DOM that corresponds to at least one selected document, wherein the annotated nodes respectively correspond to respective signatures included in the index; and matching signatures to determine which documents have nodes corresponding to the annotated nodes.
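The steps named in the abstract (compute node signatures, build a signature index, annotate nodes of a selected document, match) can be illustrated with a minimal Python sketch. The abstract does not say how a signature is computed, so this sketch assumes a signature is a short hash of the root-to-node tag path. All function names here (`signature`, `node_signatures`, `build_index`, `match_annotations`) are hypothetical and are not taken from the source.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class _PathCollector(HTMLParser):
    """Collects the root-to-node tag path of every element in a document."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, root first
        self.paths = []   # one tag path per element seen

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def signature(path):
    """ASSUMPTION: a node signature is a truncated SHA-1 of its tag path."""
    return hashlib.sha1(path.encode("utf-8")).hexdigest()[:12]


def node_signatures(html):
    """Return the set of signatures for all nodes in one document."""
    parser = _PathCollector()
    parser.feed(html)
    parser.close()
    return {signature(p) for p in parser.paths}


def build_index(docs):
    """Map each signature to the ids of documents containing such a node."""
    index = defaultdict(set)
    for doc_id, html in docs.items():
        for sig in node_signatures(html):
            index[sig].add(doc_id)
    return index


def match_annotations(index, annotated_paths):
    """For each annotated node (label -> tag path), list matching documents."""
    return {
        label: set(index.get(signature(path), set()))
        for label, path in annotated_paths.items()
    }


docs = {
    "a": "<html><body><div><h1>Widget</h1><span>$5</span></div></body></html>",
    "b": "<html><body><div><h1>Gadget</h1><span>$7</span></div></body></html>",
    "c": "<html><body><p>About us</p></body></html>",
}
index = build_index(docs)
# Annotations made on the selected document "a": label -> tag path of the node.
annotations = {"title": "html/body/div/h1", "price": "html/body/div/span"}
matches = match_annotations(index, annotations)
print(sorted(matches["title"]))  # ['a', 'b']
```

Under this assumed signature, documents "a" and "b" share the annotated nodes' tag paths, so annotations made once on "a" can be located in "b" through the index without re-parsing every document. A tag-path hash is only one possible choice; a signature that also incorporates attributes or sibling position would trade recall for precision.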