Background: Essay items have been widely used in educational tests. Raters must often be recruited to mark essay items, and they may differ considerably in severity. Furthermore, a rater may show different degrees of severity for different groups of examinees, which is referred to as differential rater functioning (DRF). In most DRF studies, the group memberships of examinees are known, for example, gender or ethnicity. However, DRF may also occur when the group memberships of examinees are unknown. Aims: This study aims to develop a new mixture facets model to assess DRF when group membership is unknown. Sample: Simulated datasets were generated with two latent classes by manipulating six conditions: (a) sample size; (b) group mean difference; (c) number of items; (d) number of raters; (e) magnitude of DRF; and (f) tendency of DRF. Methods: The generating model was fit to the simulated datasets. The dependent variables were the bias and root mean square error of the parameter estimates, and the accuracy of latent class identification. Twenty replications were made under each condition. A Bayesian method was used to estimate the parameters using the freeware WinBUGS. Results: Parameter estimation and the classification of latent classes were more accurate when the dataset was larger (i.e., larger sample sizes, longer tests, and more raters), a group mean difference existed, the differences in the rater parameters between latent classes were larger, and the pattern of DRF was balanced between groups. Conclusions: Fitting the proposed mixture facets model is useful for exploring the inconsistency of rater severity with respect to different latent classes.
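The three outcome measures named in the Methods (bias, root mean square error, and latent class identification accuracy) can be illustrated with a minimal Python sketch. This is not the study's code: the function names, the example values, and the handling of arbitrary class labels are assumptions made for illustration, and the sketch presumes exactly two latent classes, as in the simulation design.

```python
import math


def bias(estimates, true_value):
    """Mean signed error of the estimates across replications."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Root mean square error of the estimates across replications."""
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    return math.sqrt(sum(squared_errors) / len(estimates))


def classification_accuracy(assigned, true_classes):
    """Proportion of examinees placed in their generating class.

    Assumes exactly two latent classes. Class labels in a mixture model
    are arbitrary, so the accuracy is taken under whichever of the two
    possible labelings agrees better with the generating classes.
    """
    matches = sum(a == t for a, t in zip(assigned, true_classes))
    agreement = matches / len(true_classes)
    return max(agreement, 1.0 - agreement)


# Hypothetical severity estimates for one rater over five replications
# (illustrative numbers only, not results from the study).
true_severity = 0.50
severity_estimates = [0.40, 0.60, 0.55, 0.45, 0.70]

print("bias:", round(bias(severity_estimates, true_severity), 4))
print("rmse:", round(rmse(severity_estimates, true_severity), 4))
print("accuracy:", classification_accuracy([1, 1, 2, 2], [2, 2, 1, 2]))
```

Bias captures systematic over- or underestimation, whereas RMSE also reflects the variability of the estimates across replications, which is why simulation studies commonly report both.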
|Publication status||Published - 2012|