Exposing the hidden web for chemical digital libraries

Authors: Sascha Tönnies, Benjamin Köhncke, Oliver Koepler, Wolf-Tilo Balke

DOI: 10.1145/1816123.1816159

Abstract: In recent years, the vast amount of digitally available content has led to the creation of many topic-centered digital libraries. Also in the domain of chemistry more and more digital collections are available, but complex query formulation still hampers their intuitive adoption. This is because information seeking in chemical documents is focused on chemical entities, for which current standard search relies on complex structures that are hard to extract from documents. Moreover, although simple keyword searches would often be sufficient, current collections simply cannot be indexed by Web search providers due to the ambiguity of chemical substance names. In this paper we present a framework for automatically generating metadata-enriched index pages for all documents in a given chemical collection. All information is then linked to the respective documents and thus provides an easy-to-crawl metadata repository promising to open up digital chemical libraries. Our experiments, indexing an open access journal, show that not only can the documents be found using a simple Google search via the created index pages, but also that the quality of the search is much more efficient than fulltext indexing in terms of both precision/recall and performance. Finally, we compare our indexing against a classical structure search and find that keyword-based search can indeed solve at least some of the daily tasks in chemical workflows. Using our framework thus promises to expose a large part of the currently hidden chemical Web, making the techniques employed interesting for chemical information providers like digital libraries and open access journals.

References (18)
Elbert G. Smith, William J. Wiswesser. The Wiswesser line-formula chemical notation. McGraw-Hill (1968).
Peter Corbett, Peter Murray-Rust. High-Throughput Identification of Chemistry in Life Science Texts. Computational Life Sciences II, pp. 107-118 (2006). DOI: 10.1007/11875741_11
Igor V. Filippov, Marc C. Nicklaus. Optical Structure Recognition Software To Recover Chemical Information: OSRA — An Open Source Solution. Journal of Chemical Information and Modeling, vol. 49, pp. 740-743 (2009). DOI: 10.1021/CI800067R
David Weininger. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, vol. 28, pp. 31-36 (1988). DOI: 10.1021/CI00057A005
Maria Liakata, Claire Q, Larisa N. Soldatova. Semantic Annotation of Papers: Interface & Enrichment Tool (SAPIENT). North American Chapter of the Association for Computational Linguistics, pp. 193-200 (2009). DOI: 10.3115/1572364.1572391
D. J. Gluck. A Chemical Structure Storage and Search System Developed at Du Pont. Journal of Chemical Documentation, vol. 5, pp. 43-51 (1965). DOI: 10.1021/C160016A008
Aniko T. Valko, A. Peter Johnson. CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition. Journal of Chemical Information and Modeling, vol. 49, pp. 780-787 (2009). DOI: 10.1021/CI800449T
Simone Teufel, Jean Carletta, Marc Moens. An annotation scheme for discourse-level argumentation in research articles. Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics, pp. 110-117 (1999). DOI: 10.3115/977035.977051
Joe R. McDaniel, Jason R. Balmuth. Kekule: OCR-optical chemical (structure) recognition. Journal of Chemical Information and Computer Sciences, vol. 32, pp. 373-378 (1992). DOI: 10.1021/CI00008A018