Cost-effective web search in bootstrapping for named entity recognition

Authors: Hideki Kawai, Hironori Mizuguchi, Masaaki Tsuchida

DOI: 10.1007/978-3-540-78568-2_29

Keywords: Computer science, Search engine, Web API, Bootstrapping (linguistics), Semantic search, Web application, Search analytics, Named-entity recognition, Information retrieval, Web search query

Abstract: In this paper, we propose a cost-effective search strategy framework to extract keywords in the same semantic class from the Web. Constructing a dictionary based on the bootstrapping technique is one promising approach to harnessing knowledge scattered around the Web. Open web application programming interfaces (APIs) are powerful tools for the knowledge-gathering process. However, we have to consider the cost of API calls, because too many queries can overload search engines, and they also limit the number of API calls. Our goal is to optimize a search strategy that can collect as many new words as possible with the least API calls. Our results show that the optimized search strategy can collect 64,642 words in five different domains with a precision of 0.94 using only 1,000 API calls.
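As a rough illustration of the setting the abstract describes (bootstrapping a dictionary from seed words under a fixed budget of search calls), the following minimal sketch may help. It is not the authors' search strategy: `MOCK_PAGES`, `mock_search`, and the greedy "query the best-supported word first" rule are all assumptions made for this example, and a real system would call a web search API and extract candidates with learned patterns.

```python
from collections import Counter

# Hypothetical stand-in for web pages: each page is a list of co-occurring terms.
MOCK_PAGES = [
    ["tokyo", "osaka", "kyoto"],
    ["osaka", "nagoya", "kobe"],
    ["kyoto", "nara"],
    ["kobe", "sapporo", "nagoya"],
    ["nara", "tokyo"],
]


def mock_search(query):
    """Hypothetical search API: return every mock page that mentions the query."""
    return [page for page in MOCK_PAGES if query in page]


def bootstrap(seeds, budget, search=mock_search):
    """Grow a dictionary from seed words using at most `budget` search calls."""
    dictionary = set(seeds)
    support = Counter()  # how many returned pages mentioned each word so far
    queried = set()
    calls = 0
    while calls < budget:
        candidates = [w for w in dictionary if w not in queried]
        if not candidates:
            break  # every known word has already been used as a query
        # Assumed heuristic: query the best-supported word first,
        # breaking ties alphabetically so the run is deterministic.
        query = min(candidates, key=lambda w: (-support[w], w))
        queried.add(query)
        calls += 1  # each search counts as one API call against the budget
        for page in search(query):
            for word in page:
                support[word] += 1
                dictionary.add(word)
    return dictionary, calls


if __name__ == "__main__":
    words, used = bootstrap(["tokyo"], budget=2)
    print(sorted(words), used)
```

The point of the sketch is only the trade-off: the order in which words are chosen as queries determines how many new words each call yields, which is the quantity the paper sets out to optimize.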
