Authors: Juan F Morales, Sara Chuguransky, Lucas N Alberca, Juan I Alice, Sofía Goicoechea
DOI: 10.2174/1871525718666200219130229
Keywords:
Abstract: Background: Since their introduction in the virtual screening field, Receiver Operating Characteristic (ROC) curve-derived metrics have been widely used for benchmarking computational methods and algorithms intended for virtual screening applications. Whereas in classification problems the ratio between sensitivity and specificity for a given score value is very informative, a practical concern in virtual screening campaigns is to predict the actual probability that a predicted hit will prove truly active when submitted to experimental testing (in other words, the Positive Predictive Value, PPV). Estimation of such probability is, however, obstructed by its dependency on the yield of actives of the screened library, which cannot be known a priori. Objective: To explore the use of PPV surfaces derived from simulated ranking experiments (retrospective virtual screening) as a complementary tool to ROC curves, both for the comparison of virtual screening methods and for the optimization of score cutoff values. Methods: The utility of the proposed approach is assessed in retrospective virtual screening experiments with four datasets used to infer QSAR classifiers: inhibitors of Trypanosoma cruzi trypanothione synthetase; inhibitors of Trypanosoma brucei N-myristoyltransferase; inhibitors of GABA transaminase; and anticonvulsant activity in the 6 Hz seizure model. Results: Besides illustrating the utility of PPV surfaces to compare the performance of machine learning models in virtual screening applications and to select an adequate score threshold, our results also suggest that ensemble learning provides models with better predictivity and more robust behavior. Conclusion: PPV surfaces are valuable tools to assess virtual screening methods and to choose score thresholds to be applied in prospective in silico screens. Ensemble learning approaches seem to consistently lead to improved predictivity and robustness.
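The dependency of PPV on the yield of actives follows from Bayes' rule: for sensitivity Se, specificity Sp and yield of actives Ya, PPV = Se·Ya / (Se·Ya + (1 − Sp)·(1 − Ya)). The sketch below is one possible way to tabulate a PPV surface over score thresholds and assumed yields from a retrospective screen. It is not the authors' procedure, which this record does not detail. The function names `ppv` and `ppv_surface`, the toy scores and labels, and the threshold and yield grids are all hypothetical and chosen only for illustration.

```python
import numpy as np


def ppv(sensitivity, specificity, yield_actives):
    """Bayes' rule: probability that a predicted hit is truly active."""
    se = np.asarray(sensitivity, dtype=float)
    sp = np.asarray(specificity, dtype=float)
    ya = np.asarray(yield_actives, dtype=float)
    true_pos = se * ya                    # expected fraction of true positives
    false_pos = (1.0 - sp) * (1.0 - ya)   # expected fraction of false positives
    denom = true_pos + false_pos
    with np.errstate(divide="ignore", invalid="ignore"):
        # PPV is undefined (NaN) when nothing is predicted active.
        return np.where(denom > 0, true_pos / denom, np.nan)


def ppv_surface(scores, labels, thresholds, yields):
    """PPV for every (score threshold, assumed yield of actives) pair.

    Returns an array of shape (len(thresholds), len(yields)).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.asarray(thresholds, dtype=float)
    yields = np.asarray(yields, dtype=float)

    # predicted[i, j] is True when compound j scores at or above threshold i.
    predicted = scores[None, :] >= thresholds[:, None]
    se = (predicted & labels).sum(axis=1) / labels.sum()
    sp = (~predicted & ~labels).sum(axis=1) / (~labels).sum()
    return ppv(se[:, None], sp[:, None], yields[None, :])


# Hypothetical retrospective screen: 3 actives and 5 inactives (toy data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
thresholds = [0.25, 0.5, 0.75]   # assumed score cutoffs
yields = [0.01, 0.05, 0.10]      # assumed yields of actives in the library

surface = ppv_surface(scores, labels, thresholds, yields)
print(np.round(surface, 3))
```

Each row of `surface` corresponds to one score cutoff and each column to one assumed yield, so reading along a row shows how the expected hit rate of a fixed cutoff changes with the proportion of actives in the library.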