This paper describes a study investigating discrepancies between raters on an English-language writing test in a Hong Kong public examination, where paper-based marking is soon to be replaced completely by onscreen marking. The phenomenon of discrepancy scripts emerged from an analysis of data in a related study in which 30 raters rated, on paper, scripts they had previously rated on screen. In that study, ratings on a number of scripts revealed that at least one rater had rated them more than 20 percent (5/24 points) more severely on paper than on screen, while at least one other rater had rated the same scripts more than 20 percent (5/24 points) more severely on screen than on paper. The current study was therefore designed to investigate these 'discrepancy scripts' and to discover whether specific causes of such discrepancies could be identified. A set of 15 scripts whose grades diverged by 5/24 points or more across the two marking media was identified. A control set of 15 scripts, each of which had received exactly the same grade from two different raters, was also identified. Twelve raters took part in a crossed design: six rated the discrepancy scripts on screen while the other six rated them on paper, after which the two groups swapped, rating the other set of scripts in the other medium. While rating, the raters noted whether particular scripts were easy or difficult to rate, and after the rating exercise they took part in semi-structured interviews. In the analysis, which used multi-faceted Rasch measurement, the expectation that the discrepancy scripts would show greater misfit in the Rasch model than the same-grade scripts was not borne out.
Likewise, raters' evaluations, based on two topics in different genres, showed no bias in the discrepancy scripts, nor did any features emerge that might allow a definition of discrepancy scripts to be developed. The paper concludes that some variation may be inevitable and may have to be accepted in any rating situation. Copyright © 2009 NLLIA Language Testing Research Centre.
Journal: Melbourne Papers in Language Testing
Publication status: Published - 2009