Document Informatics for Scientific Learning and Accelerated Discovery

Authors: Venu Govindaraju, Ifeoma Nwogu, Srirangaraj Setlur

DOI: 10.1016/B978-0-444-63492-4.00001-0

Keywords: Materials informatics; Field (computer science); Interdisciplinarity; Deep belief network; Interactive visualization; Computer science; Data science; Dynamic topic model; Visualization; Informatics

Abstract: This chapter presents a concept paper that describes methods to accelerate new materials discovery and optimization by enabling faster recognition and use of important theoretical, computational, and experimental information aggregated from peer-reviewed, published materials-related scientific documents online. To obtain insights for the discovery of new materials and the study of existing materials, research and development scientists and engineers rely heavily on an ever-growing number of publications, mostly available online, that date back many decades. The major thrust of this concept paper is therefore the use of technology to (i) extract "deep" meaning from a large corpus of relevant materials science documents; (ii) navigate, cluster, and present the documents in a meaningful way; and (iii) evaluate and revise the query responses until researchers are guided to their destination. While the proposed methodology targets the interdisciplinary field of materials research, the tools to be developed can be generalized to enhance scientific discoveries and learning across a broad swathe of disciplines. The research will advance the machine-learning area of developing hierarchical, dynamic topic models to investigate trends in materials discovery over user-specified time periods. The field of image-based document analysis will also benefit tremendously from machine-learning tools such as deep belief networks for classification and for the separation of text from document images. Developing an interactive visualization tool that can display modeling results from a network perspective as well as a time-based perspective is an advancement in visualization studies.
