SemTag and Seeker

Authors: Stephen Dill, John A. Tomlin, Jason Y. Zien, Nadav Eiron, David Gibson

DOI: 10.1145/775152.775178

Abstract: This paper describes Seeker, a platform for large-scale text analytics, and SemTag, an application written on the platform to perform automated semantic tagging of large corpora. We apply SemTag to a collection of approximately 264 million web pages and generate approximately 434 million automatically disambiguated semantic tags, published to the web as a label bureau providing metadata regarding the annotations. To our knowledge, this is the largest scale semantic tagging effort to date. We describe the Seeker platform, discuss the architecture of the SemTag application, describe a new disambiguation algorithm specialized to support ontological disambiguation of large-scale data, evaluate the algorithm, and present our final results with information about acquiring and making use of the semantic tags. We argue that automated large-scale semantic tagging of ambiguous content can bootstrap and accelerate the creation of the semantic web.
