Comparison of the NCI open database with seven large chemical structural databases.

Authors: Johannes H. Voigt, Bruno Bienfait, Shaomeng Wang, Marc C. Nicklaus

DOI: 10.1021/CI000150T

Keywords: Crystallographic data, Database, Chemical database, Crystallographic database, Information retrieval, Index (publishing), Directory, Computer science

Abstract: Eight large chemical databases have been analyzed and compared to each other. Central to this comparison is the open National Cancer Institute (NCI) database, consisting of approximately 250 000 structures. The other databases are the Available Chemicals Directory ("ACD," from MDL, release 1.99, 3D-version); ChemACX ("ACX," from CamSoft, Version 4.5); the Maybridge Catalog and the Asinex database (both as distributed by CamSoft as part of ChemInfo); the Sigma-Aldrich catalog (CD-ROM, 1999 Version); the World Drug Index ("WDI," Derwent, version 1999.03); and the organic part of the Cambridge Crystallographic Database ("CSD," from the Cambridge Crystallographic Data Center, 5.18). The database properties analyzed are internal duplication rates; compounds unique to each database; cumulative occurrence of compounds in an increasing number of databases; overlap of identical compounds between two databases; similarity overlap; diversity; and others. The crystallographic database CSD and the WDI show somewhat less overlap with the other databases than those databases show with each other. In particular, the collections of commercial compounds and the compilations of vendor catalogs have a substantial degree of overlap among each other. Still, no database is completely a subset of any other, and each appears to have its own niche and thus its "raison d'etre". The NCI database has by far the highest number of compounds that are unique to it. Approximately 200 000 of the NCI structures were not found in any of the other analyzed databases.
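As a rough illustration of the set-based properties named in the abstract (internal duplication rate, compounds unique to one database, pairwise overlap of identical compounds, and occurrence in an increasing number of databases), the sketch below works on toy data. It assumes every structure has already been reduced to a canonical identifier string; this record does not say how the authors matched structures, so the identifiers, the function names, and the sample contents are all hypothetical.

```python
from collections import Counter


def internal_duplication_rate(records):
    """Fraction of records that repeat an identifier already present in the same database."""
    if not records:
        return 0.0
    return 1.0 - len(set(records)) / len(records)


def pairwise_overlap(a, b):
    """Number of identical structures shared by two databases."""
    return len(set(a) & set(b))


def unique_to(name, databases):
    """Identifiers that occur in `name` and in no other database."""
    others = set().union(
        *(set(ids) for key, ids in databases.items() if key != name)
    )
    return set(databases[name]) - others


def occurrence_histogram(databases):
    """Map k -> number of distinct structures found in exactly k databases."""
    per_structure = Counter()
    for ids in databases.values():
        per_structure.update(set(ids))  # count each database at most once
    return Counter(per_structure.values())


# Toy data with hypothetical identifiers; not taken from the paper.
databases = {
    "NCI": ["c1", "c2", "c3", "c3", "c4"],
    "ACD": ["c2", "c3", "c5"],
    "WDI": ["c3", "c6"],
}

if __name__ == "__main__":
    print(internal_duplication_rate(databases["NCI"]))
    print(pairwise_overlap(databases["NCI"], databases["ACD"]))
    print(sorted(unique_to("NCI", databases)))
    print(dict(occurrence_histogram(databases)))
```

Similarity overlap and diversity, which the abstract also lists, need structural descriptors and a similarity measure and are therefore outside the scope of this sketch.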

References (26)
Ramaswamy Nilakantan, Norman Bauman, Kevin S. Haraki, Database diversity assessment: new ideas, concepts, and tools. Journal of Computer-aided Molecular Design, vol. 11, pp. 447-452 (1997), 10.1023/A:1007937308615
G. W. A. Milne, J. A. Miller, J. R. Hoover, The NCI drug information system. 4. Inventory and shipping modules. Journal of Chemical Information and Computer Sciences, vol. 26, pp. 179-185 (1986), 10.1021/CI00052A005
John D. Holliday, Sonia S. Ranade, Peter Willett, A Fast Algorithm For Selecting Sets Of Dissimilar Molecules From Large Chemical Databases. Quantitative Structure-activity Relationships, vol. 14, pp. 501-506 (1995), 10.1002/QSAR.19950140602
G. W. A. Milne, Alfred Feldman, J. A. Miller, G. P. Daly, The NCI drug information system. 3. The DIS chemistry module. Journal of Chemical Information and Computer Sciences, vol. 26, pp. 168-179 (1986), 10.1021/CI00052A004
Yukio Tominaga, Data Structure Comparison Using Box Counting Analysis. Journal of Chemical Information and Computer Sciences, vol. 38, pp. 867-875 (1998), 10.1021/CI9802070
Robert D. Brown, Yvonne C. Martin, Use of Structure-Activity Data To Compare Structure-Based Clustering Methods and Descriptors for Use in Compound Selection. Journal of Chemical Information and Computer Sciences, vol. 36, pp. 572-584 (1996), 10.1021/CI9501047
Ling Xue, Jeffrey W. Godden, Jürgen Bajorath, Database Searching for Compounds with Similar Biological Activity Using Short Binary Bit String Representations of Molecules. Journal of Chemical Information and Computer Sciences, vol. 39, pp. 881-886 (1999), 10.1021/CI990308D
Jens Sadowski, Hugo Kubinyi, A Scoring Scheme for Discriminating between Drugs and Nondrugs. Journal of Medicinal Chemistry, vol. 41, pp. 3325-3329 (1998), 10.1021/JM9706776
Markus Wagener, Vincent J. van Geerestein, Potential drugs and nondrugs: prediction and identification of important structural features. Journal of Chemical Information and Computer Sciences, vol. 40, pp. 280-292 (2000), 10.1021/CI990266T