Abstract: The measurement of agreement of repeat ratings is the usual method of assessing the reliability of categorical scales. Measurement of agreement is also important in genetic twin studies based on categorical scales. One of the most commonly used methods of analysis for both types of study is the kappa coefficient. For scales with more than two categories, one approach is to use a single summary kappa coefficient. While this may be sufficient for many studies, in some instances investigation of heterogeneity in the pattern of agreement may give additional insights, as there may be greater agreement for some pairs of categories than for others. In this paper, kappa-type coefficients are used to model agreement. Constraints are added to a heterogeneous model to obtain simplified models. Procedures for estimation, confidence intervals, and inference for these models are described for the case of two ratings per subject, both for a single sample and for the comparison of two independent samples. Formulae for sample size and power calculation are derived using the non-central chi-squared distribution. Two simulation studies are carried out to check empirical test size and power. Methods are illustrated by examples involving nominal scales with three categories.
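
As a rough illustration of the contrast between a single summary kappa coefficient and category-level agreement, the sketch below computes the standard Cohen's kappa for a three-category table of two ratings per subject, and then a kappa for each category against all others by collapsing the table to 2x2. This is only a minimal sketch: the function names `cohen_kappa` and `category_kappa` and the counts are hypothetical, and the collapsed-table kappa is a common textbook device, not necessarily the kappa-type coefficients or the constrained models developed in the paper.

```python
from typing import List, Sequence


def cohen_kappa(table: Sequence[Sequence[float]]) -> float:
    """Standard Cohen's kappa for a square table of counts.

    Rows index the first rating, columns index the second rating.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Marginal proportions for each rating occasion.
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Observed agreement and agreement expected under independence.
    p_obs = sum(table[i][i] for i in range(k)) / n
    p_exp = sum(row_p[i] * col_p[i] for i in range(k))
    return (p_obs - p_exp) / (1.0 - p_exp)


def category_kappa(table: Sequence[Sequence[float]], c: int) -> float:
    """Kappa for category c versus all other categories (2x2 collapse)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    both = table[c][c]                                  # both ratings in c
    first_only = sum(table[c]) - both                   # only first rating in c
    second_only = sum(table[i][c] for i in range(k)) - both
    neither = n - both - first_only - second_only
    collapsed: List[List[float]] = [[both, first_only], [second_only, neither]]
    return cohen_kappa(collapsed)


# Hypothetical counts for a nominal scale with three categories:
# category 0 is rated consistently, categories 1 and 2 are often confused.
counts = [
    [30, 2, 3],
    [2, 15, 13],
    [3, 13, 19],
]

print("summary kappa:", round(cohen_kappa(counts), 3))
for c in range(3):
    print("category", c, "kappa:", round(category_kappa(counts, c), 3))
```

In this made-up table the summary coefficient hides the fact that agreement is much higher for the first category than for the other two, which is the kind of heterogeneity the abstract refers to.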