Background: Sensitivity and threshold are two important elements in making judgments (Jackson 1972). When raters are recruited to mark responses to constructed-response items, the effects induced by raters should be fully considered. However, the standard facets model (Linacre 1989) and the generalized facets model (Wang and Liu 2007) account only for the threshold element (rater severity) and do not consider the sensitivity element. Aims: Rater sensitivity can be defined as the effectiveness of a rater in differentiating ratees with varying degrees of proficiency. By treating the combination of an item and a rater as a pseudo-item, we intend to decompose the slope (discrimination) parameter attached to the pseudo-item into two parts: item discrimination and rater sensitivity. Sample: A dataset gathered by Congdon and McQueen (1997) was analyzed, in which each of the 8,296 students’ writing scripts was graded by two raters randomly chosen from a set of 16 raters on two criteria (items) against a six-point scale. Methods: Different formulations of facets models were fitted using the freeware WinBUGS. Results: The results supported our expectation: the two rating criteria had different degrees of discrimination power; the 16 raters had different sensitivities; and the slope of the pseudo-items was approximately the product of item discrimination and rater sensitivity. Conclusions: The resulting generalized facets model can account for item characteristics (difficulty and discrimination) and rater properties (severity and sensitivity) simultaneously.
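The decomposition described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the model equation, so the form used here is an assumption: an adjacent-category (partial-credit style) logit in which the pseudo-item slope is the product of item discrimination and rater sensitivity, and the location is the sum of item difficulty, rater severity, and a step threshold. The names `category_probs`, `expected_score`, and `score_gap`, and all parameter values, are hypothetical and are not estimates from the Congdon and McQueen (1997) data.

```python
import math


def category_probs(theta, item_disc, rater_sens, item_diff, rater_sev, steps):
    """Category probabilities for one ratee, item, and rater.

    Assumed form (not taken from the paper):
      log(P_k / P_{k-1}) = item_disc * rater_sens
                           * (theta - item_diff - rater_sev - steps[k-1])
    """
    slope = item_disc * rater_sens  # slope of the item-by-rater pseudo-item
    logits = [0.0]  # cumulative logit of the lowest category is fixed at 0
    for tau in steps:
        logits.append(logits[-1] + slope * (theta - item_diff - rater_sev - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probs):
    """Expected rating, with categories scored 0, 1, 2, ..."""
    return sum(k * p for k, p in enumerate(probs))


def score_gap(rater_sens, steps, high=1.5, low=-1.5):
    """Expected-score difference between a stronger and a weaker ratee."""
    strong = category_probs(high, 1.0, rater_sens, 0.0, 0.0, steps)
    weak = category_probs(low, 1.0, rater_sens, 0.0, 0.0, steps)
    return expected_score(strong) - expected_score(weak)


if __name__ == "__main__":
    # Five hypothetical step thresholds give a six-point scale.
    steps = [-2.0, -1.0, 0.0, 1.0, 2.0]
    for sens in (0.5, 1.0, 2.0):
        print(f"rater sensitivity {sens}: score gap {score_gap(sens, steps):.3f}")
```

Under these assumptions, a rater with higher sensitivity separates the same two ratees by a wider expected-score gap, while rater severity only shifts the location of the scale.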
Publication status: Published - 2012