The percentage of examinees who are classified consistently and accurately into proficiency levels is an important measurement property of tests used to classify candidates. Given suspected discrepancies between classical test theory (CTT)-based and item response theory (IRT)-based single-administration estimates of decision consistency and decision accuracy (DC/DA), the two approaches were evaluated for accuracy and robustness under simulated conditions that varied test length, ability distribution, and degree of local item dependence (LID). The CTT-based Livingston–Lewis method was found to underestimate the DC indices across all conditions and to be more sensitive to short tests and skewed ability distributions. The IRT-based Lee method showed small biases in most conditions, except under a high degree of LID. With both methods, violation of local item independence (that is, the presence of LID) had a much greater negative effect on the DA estimates than on the DC estimates.
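To make the two quantities concrete, the following is a minimal sketch of how "true" DC and DA can be obtained in a simulation, assuming a Rasch model, a single pass/fail cut on the raw score, and two parallel forms with identical items. It is not the Livingston–Lewis or Lee estimator, and it does not reproduce the study's design; the function name `simulate_dc_da`, the item difficulties, the cut score, and the sample size are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_dc_da(n_examinees=2000, n_items=40, cut_score=24, seed=0):
    """Return (dc, da) for a pass/fail decision under an assumed Rasch model.

    dc: proportion of examinees classified the same way on two parallel forms.
    da: proportion whose form-1 classification matches the classification
        implied by their true (expected) raw score.
    """
    rng = random.Random(seed)
    # Hypothetical item difficulties, evenly spaced from -2 to 2 logits.
    difficulties = [-2 + 4 * i / (n_items - 1) for i in range(n_items)]

    consistent = 0
    accurate = 0
    for _ in range(n_examinees):
        # Ability drawn from a standard normal distribution (an assumption).
        theta = rng.gauss(0.0, 1.0)
        # Rasch probability of a correct response to each item.
        probs = [1 / (1 + math.exp(-(theta - b))) for b in difficulties]

        # True classification: expected raw score compared with the cut.
        true_pass = sum(probs) >= cut_score

        # Observed raw scores on two independently administered parallel forms.
        form1 = sum(rng.random() < p for p in probs)
        form2 = sum(rng.random() < p for p in probs)
        pass1 = form1 >= cut_score
        pass2 = form2 >= cut_score

        consistent += pass1 == pass2
        accurate += pass1 == true_pass

    return consistent / n_examinees, accurate / n_examinees


if __name__ == "__main__":
    dc, da = simulate_dc_da()
    print(f"DC = {dc:.3f}, DA = {da:.3f}")
```

In a simulation study of this kind, values computed this way would serve as the criterion against which single-administration estimates are compared. Responses here are generated independently given ability, so the sketch contains no LID.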
Deng, N., & Hambleton, R. K. (2013). Evaluating CTT- and IRT-based single-administration estimates of classification consistency and accuracy. In Springer Proceedings in Mathematics and Statistics (Vol. 66, pp. 235–250). Springer New York LLC. https://doi.org/10.1007/978-1-4614-9348-8_15