Computing inter‐rater reliability and its variance in the presence of high agreement

Abstract: Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. It is a fact that these coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations, and introduces an alternative and more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple-rater generalized π and AC1 statistics, whose validity does not depend upon the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte-Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC1 as an improved alternative to existing inter-rater reliability statistics.
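The paradox mentioned in the abstract can be illustrated with a minimal two-rater sketch. The code below is not taken from the paper: the function name `agreement_coefficients` and the example counts are hypothetical, and it uses the commonly cited two-rater forms of Cohen's kappa, Scott's pi, and the AC1 coefficient. The multiple-rater generalizations and the variance estimators proposed in the paper are not covered.

```python
from typing import Dict, List


def agreement_coefficients(table: List[List[int]]) -> Dict[str, float]:
    """Two-rater agreement coefficients from a square contingency table.

    table[i][j] is the number of subjects that rater A placed in
    category i and rater B placed in category j (hypothetical layout).
    """
    q = len(table)                      # number of categories
    n = sum(sum(row) for row in table)  # number of subjects

    # Observed agreement: proportion of subjects on the diagonal.
    p_obs = sum(table[k][k] for k in range(q)) / n

    # Marginal classification proportions for each rater.
    row = [sum(table[k]) / n for k in range(q)]
    col = [sum(table[i][k] for i in range(q)) / n for k in range(q)]
    avg = [(row[k] + col[k]) / 2 for k in range(q)]

    # Chance-agreement terms; this is where the three coefficients differ.
    pe_kappa = sum(row[k] * col[k] for k in range(q))  # Cohen's kappa
    pe_pi = sum(a * a for a in avg)                    # Scott's pi
    pe_ac1 = sum(a * (1 - a) for a in avg) / (q - 1)   # AC1

    def corrected(p_chance: float) -> float:
        # Standard chance-corrected form; undefined when p_chance == 1.
        return (p_obs - p_chance) / (1 - p_chance)

    return {
        "observed": p_obs,
        "kappa": corrected(pe_kappa),
        "pi": corrected(pe_pi),
        "ac1": corrected(pe_ac1),
    }


if __name__ == "__main__":
    # Made-up counts: both raters put almost every subject in category 0.
    result = agreement_coefficients([[118, 5], [2, 0]])
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

With these made-up counts the raters agree on 94.4% of subjects, yet kappa and pi both come out slightly below zero, while AC1 is about 0.94. The gap comes entirely from the chance-agreement term: when one category dominates, the kappa and pi terms approach the observed agreement itself, whereas the AC1 term shrinks.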
