Authors: Beatrice Berthon, Emiliano Spezi, Paulina Galavis, Tony Shepherd, Aditya Apte
DOI: 10.1002/MP.12312
Keywords: Similarity (geometry), Benchmark (computing), Benchmarking, Set (abstract data type), Software, Metric (unit), Computer science, Nuclear medicine, Data mining
Abstract: PURPOSE: The aim of this paper is to define the requirements and describe the design and implementation of a standard benchmark tool for the evaluation and validation of PET-auto-segmentation (PET-AS) algorithms. This work follows the recommendations of Task Group 211 (TG211), appointed by the American Association of Physicists in Medicine (AAPM). METHODS: The recommendations published in the AAPM TG211 report were used to derive a set of required features and to guide the design and structure of a benchmarking software tool. These items included the selection of appropriate representative data, reference contours obtained from established approaches, and the description of available metrics. The benchmark was designed in a way that it could be extendable by the inclusion of bespoke segmentation methods, while maintaining its main purpose of being a standard testing platform for newly developed PET-AS methods. An example implementation of the proposed framework, named PETASset, was built. In this work, a selection of PET-AS methods representing common approaches to PET image segmentation was evaluated within PETASset, demonstrating the capabilities of the software as a benchmark platform. RESULTS: A selection of clinical, physical, and simulated phantom data, including "best estimates" of reference contours from macroscopic specimens, simulation template, and CT scans, was built into the PETASset application database. Specific metrics such as Dice Similarity Coefficient (DSC), Positive Predictive Value (PPV), and Sensitivity (S) were included to allow the user to compare the results of any given PET-AS algorithm to the reference contours. In addition, a tool was built to generate structured reports on the performance of PET-AS algorithms against the reference contours. Across the PET-AS methods evaluated for demonstration, the metric agreement values with the reference contours varied between 0.51 and 0.83, 0.44 and 0.86, and 0.61 and 1.00 for DSC, PPV, and S, respectively. Examples of agreement limits were provided to show how the software could be used to evaluate a new algorithm against the existing state of the art. CONCLUSIONS: PETASset provides a platform that allows standardizing the evaluation and comparison of different PET-AS methods on a wide range of PET datasets. The platform will be available to users willing to evaluate their PET-AS methods and contribute more datasets.
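The three agreement metrics named in the abstract can be illustrated with a short sketch. It is not taken from PETASset: the function name `overlap_scores`, the `OverlapScores` type, and the flat 0/1 mask representation are all hypothetical choices made here for illustration. It uses the standard voxel-overlap definitions, with TP, FP, and FN counted between a segmented contour and a reference contour: DSC = 2TP / (2TP + FP + FN), PPV = TP / (TP + FP), and S = TP / (TP + FN).

```python
from typing import Iterable, NamedTuple


class OverlapScores(NamedTuple):
    """Hypothetical container for the three agreement metrics."""
    dsc: float          # Dice Similarity Coefficient
    ppv: float          # Positive Predictive Value
    sensitivity: float  # Sensitivity (S)


def overlap_scores(segmented: Iterable[int], reference: Iterable[int]) -> OverlapScores:
    """Compare two binary masks given as flat sequences of 0/1 voxel labels."""
    seg = [bool(v) for v in segmented]
    ref = [bool(v) for v in reference]
    if len(seg) != len(ref):
        raise ValueError("masks must contain the same number of voxels")

    tp = sum(s and r for s, r in zip(seg, ref))      # voxels inside both contours
    fp = sum(s and not r for s, r in zip(seg, ref))  # segmented but not in reference
    fn = sum(r and not s for s, r in zip(seg, ref))  # in reference but missed

    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("both masks must contain at least one foreground voxel")

    return OverlapScores(
        dsc=2 * tp / (2 * tp + fp + fn),
        ppv=tp / (tp + fp),
        sensitivity=tp / (tp + fn),
    )


if __name__ == "__main__":
    # Toy 8-voxel example: the segmentation is shifted by one voxel.
    reference = [0, 1, 1, 1, 1, 0, 0, 0]
    segmented = [0, 0, 1, 1, 1, 1, 0, 0]
    print(overlap_scores(segmented, reference))
```

In the toy example, three voxels overlap, one is a false positive, and one is a false negative, so all three metrics equal 0.75. A real evaluation would apply the same counts to 3D masks derived from the PET-AS output and the reference contour.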