Test equating is used to ensure that scores from different test forms can be used interchangeably. This paper compares the statistical and computational properties of three equating frameworks: item response theory observed-score equating (IRTOSE), kernel equating, and kernel IRTOSE. The real data applications suggest that the IRT-based frameworks tend to provide more stable and accurate results than kernel equating. Nonetheless, kernel equating can provide satisfactory results if a good model for the data can be found, and it is much faster than the IRT-based frameworks. Our general recommendation is to try all methods and examine how much the equated scores change, always ensuring that the assumptions are met and that a good model for the data can be found.
CITATION
Leôncio, W., & Wiberg, M. (2018). Evaluating equating transformations from different frameworks. In Springer Proceedings in Mathematics and Statistics (Vol. 233, pp. 101–110). Springer New York LLC. https://doi.org/10.1007/978-3-319-77249-3_9