Computing inter‐rater reliability and its variance in the presence of high agreement

Author: Kilem Li Gwet

DOI: 10.1348/000711006X126600

Abstract: Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations, and introduces an alternative and more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple-rater generalized π and AC1 statistics, whose validity does not depend upon the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte-Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC1 as an improved alternative to existing inter-rater reliability statistics.
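
The paradox described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard two-rater definitions: Cohen's kappa takes chance agreement from the product of the two raters' marginals, Scott's pi from the squared average marginals, and AC1 from the sum of π_k(1 − π_k) over categories divided by (q − 1). The function name `agreement_coefficients` and the 2×2 table are hypothetical, chosen only to show high observed agreement with very skewed marginals; they are not taken from the paper.

```python
def agreement_coefficients(table):
    """Return (kappa, pi, ac1) for a square two-rater contingency table.

    table[i][j] = number of subjects that rater A put in category i
    and rater B put in category j. Illustrative helper, not from the paper.
    """
    q = len(table)                                   # number of categories
    n = sum(sum(row) for row in table)               # number of subjects
    p_a = sum(table[k][k] for k in range(q)) / n     # observed agreement

    # Marginal classification proportions for each rater, and their average.
    row = [sum(table[k]) / n for k in range(q)]
    col = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
    avg = [(row[k] + col[k]) / 2 for k in range(q)]

    # Chance-agreement probabilities under each coefficient's definition.
    pe_kappa = sum(r * c for r, c in zip(row, col))
    pe_pi = sum(a * a for a in avg)
    pe_ac1 = sum(a * (1 - a) for a in avg) / (q - 1)

    def chance_corrected(p_e):
        return (p_a - p_e) / (1 - p_e)

    return (chance_corrected(pe_kappa),
            chance_corrected(pe_pi),
            chance_corrected(pe_ac1))


# Hypothetical data: 125 subjects, the raters agree on 118 of them (94.4%).
kappa, pi, ac1 = agreement_coefficients([[118, 5], [2, 0]])
print(round(kappa, 3), round(pi, 3), round(ac1, 3))  # -0.023 -0.029 0.941
```

With 94.4% observed agreement, kappa and pi both come out slightly negative because their chance-agreement terms are close to 0.945, while AC1 stays near the observed agreement. The sketch covers point estimates only; the variance estimators proposed in the paper are not reproduced here.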

References (15)
Joseph L. Fleiss, Measuring nominal scale agreement among many raters. Psychological Bulletin, vol. 76, pp. 378-382 (1971), 10.1037/H0031619
Anthony J. Conger, Integration and generalization of kappas for multiple raters. Psychological Bulletin, vol. 88, pp. 322-328 (1980), 10.1037/0033-2909.88.2.322
Joseph L. Fleiss, Jacob Cohen, B. S. Everitt, Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, vol. 72, pp. 323-327 (1969), 10.1037/H0028106
Richard J. Light, Measures of response agreement for qualitative data: Some generalizations and alternatives. Psychological Bulletin, vol. 76, pp. 365-377 (1971), 10.1037/H0031643
J.W. Holley, J.P. Guilford, A Note on the G Index of Agreement. Educational and Psychological Measurement, vol. 24, pp. 749-753 (1964), 10.1177/001316446402400402
Jacob Cohen, A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, vol. 20, pp. 37-46 (1960), 10.1177/001316446002000104
Helena Chmura Kraemer, Ramifications of a population model for κ as a coefficient of reliability. Psychometrika, vol. 44, pp. 461-472 (1979), 10.1007/BF02296208
Domenic V. Cicchetti, Alvan R. Feinstein, High agreement but low kappa: II. Resolving the paradoxes. Journal of Clinical Epidemiology, vol. 43, pp. 551-558 (1990), 10.1016/0895-4356(90)90159-M