Abstract
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to consider such testing behaviors. In this study, a new class of mixture IRT models was developed to account for such testing behavior in dichotomous and polytomous items, by assuming test-takers were composed of multiple latent classes and by adding a decrement parameter to each latent class to describe performance decline. Parameter recovery, effect of model misspecification, and robustness of the linearity assumption in performance decline were evaluated using simulations. It was found that the parameters in the new models were recovered fairly well by using the freeware WinBUGS; the failure to account for such behavior by fitting standard IRT models resulted in overestimation of difficulty parameters on items located toward the end of the test and overestimation of test reliability; and the linearity assumption in performance decline was rather robust. An empirical example is provided to illustrate the applications and the implications of the new class of models. Copyright © 2014 by the National Council on Measurement in Education.
Original language | English |
---|---|
Pages (from-to) | 178-200 |
Journal | Journal of Educational Measurement |
Volume | 51 |
Issue number | 2 |
DOIs | |
Publication status | Published - 2014 |