Authors: Mohsen Sadatsafavi, Mehdi Najafzadeh, Larry Lynd, Carlo Marra
DOI: 10.1016/J.JCLINEPI.2007.10.023
Keywords:
Abstract: Objective: Any attempt to generalize the performance of a subjective diagnostic method should take into account the sample variation in both cases and readers. Most current measures of the performance of a test, especially the indices of reliability, only tackle the variation of cases, and hence are not suitable for generalizing the results across the population of readers. We attempted to study the effect of readers' variation on two measures of multireader reliability: pair-wise agreement and Fleiss' kappa. Study Design and Setting: We used a normal hierarchical model with a latent trait (signal) variable to simulate a binary decision-making task by different numbers of readers on an infinite sample of cases. Results: It could be shown that both measures, especially Fleiss' kappa, have a large sampling variance when estimated by a small number of readers, casting doubt on their accuracy given the number of readers typically used in reliability studies. Conclusion: The majority of reliability studies are likely to be limited by the number of readers and are unlikely to produce a reliable estimate of reader agreement.
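
The two measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The model form is an assumption made only for illustration: each reader gets a normally distributed decision threshold, each case a standard normal latent signal, and each reading adds independent normal noise. The parameters `reader_sd` and `noise_sd`, the function names, and the use of a large finite case sample in place of the infinite sample described in the abstract are all hypothetical choices.

```python
import random
import statistics
from itertools import combinations


def simulate_ratings(n_cases, n_readers, reader_sd=0.5, noise_sd=1.0, rng=None):
    """Return ratings[case][reader] in {0, 1} under an ASSUMED latent-signal model."""
    if rng is None:
        rng = random.Random(0)
    # Assumption: one decision threshold per reader, drawn from N(0, reader_sd^2).
    thresholds = [rng.gauss(0.0, reader_sd) for _ in range(n_readers)]
    ratings = []
    for _ in range(n_cases):
        signal = rng.gauss(0.0, 1.0)  # latent trait of the case
        # Each reader perceives signal + private noise and compares it to the threshold.
        row = [int(signal + rng.gauss(0.0, noise_sd) > t) for t in thresholds]
        ratings.append(row)
    return ratings


def pairwise_agreement(ratings):
    """Proportion of (case, reader pair) combinations with identical ratings."""
    pairs = list(combinations(range(len(ratings[0])), 2))
    agree = sum(row[i] == row[j] for row in ratings for i, j in pairs)
    return agree / (len(ratings) * len(pairs))


def fleiss_kappa(ratings):
    """Fleiss' kappa for binary ratings, with every reader rating every case."""
    n_cases, n = len(ratings), len(ratings[0])
    observed, positives = 0.0, 0
    for row in ratings:
        k1 = sum(row)
        k0 = n - k1
        # Per-case agreement: share of reader pairs that agree on this case.
        observed += (k1 * (k1 - 1) + k0 * (k0 - 1)) / (n * (n - 1))
        positives += k1
    observed /= n_cases
    p1 = positives / (n_cases * n)
    expected = p1 ** 2 + (1 - p1) ** 2  # chance agreement from marginal rates
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is used
    return (observed - expected) / (1 - expected)


def kappa_spread(n_readers, n_panels=200, n_cases=2000, seed=0):
    """Standard deviation of kappa across independently sampled reader panels."""
    rng = random.Random(seed)
    kappas = [
        fleiss_kappa(simulate_ratings(n_cases, n_readers, rng=rng))
        for _ in range(n_panels)
    ]
    return statistics.pstdev(kappas)
```

Calling `kappa_spread(3)` and `kappa_spread(20)` and comparing the results gives a rough sense of how the spread of the estimate depends on panel size under these assumed parameters; the numbers depend entirely on the illustrative model and should not be read as the paper's results.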