Analysing and Navigating Natural Products Space for Generating Small, Diverse, But Representative Chemical Libraries.

Authors: Steve O’Hagan, Douglas B. Kell

DOI: 10.1002/BIOT.201700503

Abstract: Armed with the digital availability of two natural products libraries, amounting to some 195 885 molecular entities, we ask the question of how we can best sample from them to maximize their "representativeness" in smaller and more usable libraries of 96, 384, 1152, and 1920 molecules. The term "representativeness" is intended to include diversity, but for numerical reasons (and the likelihood of being able to perform a QSAR) it is necessary to focus on areas of chemical space that are highly populated. Encoding chemical structures as fingerprints using the RDKit "patterned" algorithm, we first assess the granularity of the natural products space using a simple clustering algorithm, showing that there are major regions of "denseness" but also a great many very sparsely populated areas. We then apply a "hybrid" hierarchical K-means clustering algorithm to the data to produce statistically robust clusters from which representative and appropriate numbers of samples may be chosen. There is necessarily again a trade-off between cluster size and cluster number, but within these constraints, libraries containing 384 or 1152 molecules can be found that come from clusters representing some 18 and 30% of the whole chemical space, with cluster sizes of, respectively, 50 and 27 or above, just about sufficient to perform a QSAR. By using the online availability of molecules via the Molport system (www.molport.com), we are also able to construct (and, for the first time, provide the contents of) a small virtual library of available molecules that provided effective coverage of the chemical space described. Consistent with this, the average similarities of the contents of the libraries developed are considerably lower than those of the original libraries. The suggested libraries may have use in phenotypic screening, including for determining possible transporter substrates.
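The selection step described in the abstract (keep only well-populated clusters, pick one representative from each, and report how much of the space those clusters cover) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that fingerprints are already available as sets of on-bit indices (in practice they might come from RDKit pattern fingerprints), that cluster labels come from some earlier clustering step, and that the representative is the cluster medoid under Tanimoto similarity. The abstract does not state how representatives were chosen, and the names `tanimoto` and `pick_representatives` and the toy data are hypothetical.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_representatives(fingerprints, labels, min_cluster_size):
    """Pick one medoid per sufficiently large cluster.

    fingerprints: list of sets of on-bit indices, one per molecule (assumed input)
    labels: cluster label per molecule, from any prior clustering step (assumed input)
    Returns (representatives, coverage), where representatives maps
    cluster label -> molecule index and coverage is the fraction of all
    molecules that lie in the retained clusters.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    representatives = {}
    covered = 0
    for label, members in clusters.items():
        if len(members) < min_cluster_size:
            continue  # too sparsely populated to support a QSAR
        covered += len(members)
        # Medoid: the member with the highest summed similarity to its cluster.
        representatives[label] = max(
            members,
            key=lambda i: sum(
                tanimoto(fingerprints[i], fingerprints[j]) for j in members
            ),
        )

    coverage = covered / len(labels) if labels else 0.0
    return representatives, coverage


# Toy data (hypothetical): two dense clusters and one singleton.
fps = [
    {1, 2, 3}, {1, 2, 3, 4}, {1, 2},
    {10, 11}, {10, 11, 12}, {10, 11, 12, 13},
    {20},
]
labels = [0, 0, 0, 1, 1, 1, 2]
reps, coverage = pick_representatives(fps, labels, min_cluster_size=2)
```

Raising `min_cluster_size` makes each retained cluster better suited to a QSAR but lowers the coverage, which is the trade-off between cluster size and cluster number that the abstract mentions.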
