Authors: Steve O'Hagan, Douglas B. Kell
Keywords:
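Abstract: Background. Previous studies compared the molecular similarity of marketed drugs and endogenous human metabolites (endogenites), using a series of fingerprint-type encodings, variously ranked and clustered using the Tanimoto (Jaccard) similarity coefficient (TS). Because this gives equal weight to all parts of the encoding (and thence to different substructures in the molecule), it may not be optimal, since in many cases not all parts of the molecule will bind to their macromolecular targets. Unsupervised methods cannot alone uncover this. We here explore the kinds of differences that may be observed when the TS is replaced, in a manner more equivalent to semi-supervised learning, by variants of the asymmetric Tversky (TV) similarity, which includes α and β parameters. Results. Dramatic differences are observed in (i) the drug-endogenite similarity heatmaps, (ii) the cumulative ‘greatest similarity’ curves, and (iii) the fraction of drugs with a Tversky similarity to a metabolite exceeding a given value, when the Tversky α and β parameters are varied from their Tanimoto values. The same is true when the sum of the α and β parameters is varied. A clear trend towards increased endogenite-likeness of marketed drugs is observed when α or β adopt values nearer the extremes of their range, and when their sum is smaller. The kinds of molecules exhibiting the greatest similarity to two interrogating drug molecules (chlorpromazine and clozapine) also vary in both nature and the values of their similarity as α and β are varied. The same is true for the converse, when drugs are interrogated with an endogenite. The fraction of drugs with a Tversky similarity to a molecule in a library exceeding a given value depends on the contents of that library, and α and β may be ‘tuned’ accordingly, in a semi-supervised manner. At some values of α and β, drug discovery library candidates or natural products can “look” much more like (i.e. have a numerical similarity much closer to) drugs than do even endogenites. Conclusions. Overall, the Tversky similarity metrics provide a more useful range of examples of molecular similarity than does the simpler Tanimoto similarity, and help to draw attention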
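
To make the contrast between the symmetric Tanimoto coefficient and the asymmetric Tversky similarity concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard set-based definition, TV(A, B) = |A ∩ B| / (|A ∩ B| + α|A \ B| + β|B \ A|), where α and β are the two weighting parameters and α = β = 1 recovers the Tanimoto value. The function names, the toy fingerprints (sets of "on" bit indices), and the choice to return 0.0 for two empty fingerprints are all illustrative assumptions.

```python
def tversky(a, b, alpha=1.0, beta=1.0):
    """Tversky similarity between two fingerprints given as sets of on-bits.

    alpha weights the bits found only in `a`, beta the bits found only in `b`.
    """
    a, b = set(a), set(b)
    common = len(a & b)   # bits set in both fingerprints
    only_a = len(a - b)   # bits set only in the first fingerprint
    only_b = len(b - a)   # bits set only in the second fingerprint
    denom = common + alpha * only_a + beta * only_b
    # Assumption: two empty fingerprints (zero denominator) score 0.0.
    return common / denom if denom else 0.0


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient: the Tversky case alpha = beta = 1."""
    return tversky(a, b, alpha=1.0, beta=1.0)


# Hypothetical toy fingerprints: the "metabolite" bits are a subset of the "drug" bits.
drug = {1, 2, 3, 4, 5, 6}
metabolite = {1, 2, 3}

print(tanimoto(drug, metabolite))                      # symmetric score
print(tversky(drug, metabolite, alpha=0.0, beta=1.0))  # ignores drug-only bits
print(tversky(metabolite, drug, alpha=0.0, beta=1.0))  # swapping arguments changes the score
```

With α = 0 the bits unique to the first molecule carry no penalty, so a drug that fully contains a metabolite's substructure bits scores as maximally similar to it, whereas the Tanimoto value for the same pair is lower. This is one way to see why similarity distributions can shift as α and β move away from their Tanimoto values.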