Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use

Carol M. Myford; Edward W. Wolfe

Journal Article

Journal of Educational Measurement (2009) 46(4) 371-389

DOI: 10.1111/j.1745-3984.2009.00088.x

80 Citations

67 Readers

Abstract

In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition examination, employing a multifaceted Rasch approach to determine whether raters exhibited evidence of two types of differential rater functioning over time (i.e., changes in levels of accuracy or scale category use). Some raters showed statistically significant changes in their levels of accuracy as the scoring progressed, while other raters displayed evidence of differential scale category use over time. © 2009 by the National Council on Measurement in Education.

Citation (APA)

Myford, C. M., & Wolfe, E. W. (2009). Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use. Journal of Educational Measurement, 46(4), 371–389. https://doi.org/10.1111/j.1745-3984.2009.00088.x
