Resource-Bounded Information Extraction: Acquiring Missing Feature Values on Demand

Authors: Pallika Kanani, Andrew McCallum, Shaohan Hu

DOI: 10.1007/978-3-642-13657-3_45

Keywords: Resource (project management); Missing data; Feature (computer vision); Information extraction; Task (project management); Specific-information; Probabilistic logic; Information retrieval; Computer science; Data mining

Abstract: We present a general framework for the task of extracting specific information "on demand" from a large corpus such as the Web under resource constraints. Given a database with missing or uncertain information, the proposed system automatically formulates queries, issues them to a search interface, selects a subset of the documents, extracts the required information from them, and fills in the missing values in the original database. We also exploit the inherent dependency within the data to obtain useful information with fewer computational resources. We build such a system in the citation domain that extracts missing publication years using limited resources from the Web. We discuss a probabilistic approach for this task and present first results. The main contribution of this paper is to propose a general, comprehensive architecture for designing a system adaptable to different domains.
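
The pipeline named in the abstract (formulate queries, issue them to a search interface, select a subset of documents, extract values, fill the database) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' system: the function names, the `query_budget` and `docs_per_query` parameters, the in-memory `TOY_CORPUS` standing in for a Web search interface, and the simple majority vote, which replaces the probabilistic approach the paper discusses.

```python
import re
from collections import Counter

# Matches four-digit years from 1900 to 2099 (illustrative extractor only).
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")

# Hypothetical stand-in for a Web search interface: title -> result snippets.
TOY_CORPUS = {
    "Paper A": ["Paper A, published 2007.", "Paper A (2007) by Smith",
                "cited in 2009: Paper A"],
    "Paper B": ["Paper B appeared in 2005."],
}


def toy_search(query):
    """Return the snippets of every toy-corpus title mentioned in the query."""
    return [doc for title, docs in TOY_CORPUS.items()
            if title in query for doc in docs]


def formulate_query(record):
    """Build a search query from the fields that are already known."""
    return f'"{record["title"]}" {record["author"]}'


def extract_years(text):
    """Return every year-like token found in the text."""
    return YEAR_PATTERN.findall(text)


def fill_missing_years(records, search, query_budget, docs_per_query):
    """Fill missing 'year' fields in place, issuing at most query_budget queries.

    Returns the number of queries actually issued.
    """
    queries_used = 0
    for record in records:
        if record.get("year") is not None:
            continue  # nothing to acquire for this record
        if queries_used >= query_budget:
            break  # resource bound reached
        documents = search(formulate_query(record))
        queries_used += 1
        votes = Counter()
        # Select only a subset of the returned documents to process.
        for doc in documents[:docs_per_query]:
            votes.update(extract_years(doc))
        if votes:
            # Majority vote over extracted years (an assumed heuristic).
            record["year"] = int(votes.most_common(1)[0][0])
    return queries_used
```

With a budget of one query, only the first record with a missing year is filled; the remaining records keep their missing values, which is the trade-off a resource bound imposes.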

References (16)
Jerry Fowler, Brad Perry, Marian H. Nodine. Active Information Gathering in InfoSleuth. CODAS, pp. 15-26 (1999).
Carlos Guestrin, Andreas Krause. Near-optimal nonmyopic value of information in graphical models. Uncertainty in Artificial Intelligence, pp. 324-331 (2005).
Andrew McCallum, Pallika Kanani, Chris Pal. Improving author coreference by resource-bounded information gathering from the web. International Joint Conference on Artificial Intelligence, pp. 429-434 (2007).
Pallika Kanani, Andrew McCallum. Resource-bounded information gathering for correlation clustering. Conference on Learning Theory, vol. 4539, pp. 625-627 (2007). DOI: 10.1007/978-3-540-72927-3_46
Lise Getoor, Mustafa Bilgic. VOILA: efficient feature-value acquisition for classification. National Conference on Artificial Intelligence, pp. 1225-1230 (2007).
Jimmy Lin, Aaron Fernandes, Boris Katz, Gregory Marton, Stefanie Tellex. Extracting Answers from the Web Using Knowledge Annotation and Knowledge Mining Techniques. Defense Technical Information Center (2006). DOI: 10.21236/ADA456267
Shubin Zhao, Jonathan Betz. Corroborate and learn facts from the web. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '07), pp. 995-1003 (2007). DOI: 10.1145/1281192.1281299
Aaron Fernandes, Boris Katz, Jimmy J. Lin, Gregory Marton, Stefanie Tellex. Extracting Answers from the Web Using Data Annotation and Knowledge Mining Techniques. Text Retrieval Conference (2002).
Ralph Grishman, Beth Sundheim. Message Understanding Conference-6: a brief history. International Conference on Computational Linguistics, vol. 1, pp. 466-471 (1996). DOI: 10.3115/992628.992709
Victor S. Sheng, Charles X. Ling. Feature value acquisition in testing. Proceedings of the 23rd International Conference on Machine Learning (ICML '06), pp. 809-816 (2006). DOI: 10.1145/1143844.1143946