Inverted indices in information extraction to improve records extracted per annotation

Author: Mahesh Tiyyagura

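Abstract: A method is provided for information extraction from among a multiplicity of documents, each having a corresponding document object model (DOM), comprising: computing signatures associated with nodes of the DOMs corresponding to the documents; producing an index that associates each computed signature with the documents that have a DOM in which one or more nodes are associated with that signature; annotating nodes of the DOM that corresponds to at least one selected document, wherein the annotated nodes respectively correspond to respective signatures included in the index; and matching the signatures of the annotated nodes against the index to determine which documents have matching nodes.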
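
The steps named in the abstract (signature computation, index construction, annotation, matching) can be illustrated with a minimal Python sketch. The abstract does not say how a signature is computed or what counts as a match, so two choices here are assumptions made purely for illustration: a node's signature is taken to be its root-to-node tag path, and a document matches when it contains every annotated signature. The names `node_signatures`, `build_index`, and `match_annotations` are hypothetical and do not come from the source.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class SignatureCollector(HTMLParser):
    """Collects one signature per element node.

    Assumption: the signature is the root-to-node tag path, e.g. "html/body/div/h1".
    """

    def __init__(self):
        super().__init__()
        self.stack = []          # tags currently open, outermost first
        self.signatures = set()  # every distinct tag path seen in the document

    def handle_starttag(self, tag, attrs):
        self.signatures.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def node_signatures(html):
    """Return the set of node signatures found in one document."""
    collector = SignatureCollector()
    collector.feed(html)
    collector.close()
    return collector.signatures


def build_index(documents):
    """Build the inverted index: signature -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, html in documents.items():
        for signature in node_signatures(html):
            index[signature].add(doc_id)
    return index


def match_annotations(index, annotated_signatures):
    """Return documents that contain every annotated signature (assumed criterion)."""
    postings = [index.get(sig, set()) for sig in annotated_signatures]
    return set.intersection(*postings) if postings else set()


# Hypothetical corpus: two product pages sharing a template, one unrelated page.
docs = {
    "a": "<html><body><div><h1>Phone</h1><span>$99</span></div></body></html>",
    "b": "<html><body><div><h1>Laptop</h1><span>$999</span></div></body></html>",
    "c": "<html><body><p>About us</p></body></html>",
}
index = build_index(docs)

# Annotating the title and price nodes of document "a" yields these signatures.
annotated = {"html/body/div/h1", "html/body/div/span"}
matches = match_annotations(index, annotated)
```

Under these assumptions, annotating one page of a template also selects the other page built from the same template, since the lookup is a set intersection over the index postings rather than a scan of every document. This is the sense in which an inverted index can raise the number of records extracted per annotation.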
