There is a broad literature on multiple-choice test development, covering both item-writing guidelines and the psychometric functioning of tests as measurement tools. However, most of the published literature concerns multiple-choice testing in the context of expert-designed, high-stakes standardized assessments, with little attention paid to the use of the technique in non-expert, instructor-created classroom examinations. In this work, we present a quantitative analysis of a large corpus of multiple-choice tests deployed in the classrooms of a primarily undergraduate university in Canada. Our report aims to establish three related things. First, reporting on the functional and psychometric operation of 182 multiple-choice tests deployed in a variety of courses at all undergraduate levels establishes a much-needed baseline for classroom tests as they are actually deployed. Second, we motivate and present modified statistical measures, such as item-excluded correlation measures of discrimination and length-normalized measures of reliability, that should serve as useful parameters for future comparisons of classroom test psychometrics. Finally, we use the broad empirical data from our survey of tests to update widely used item-quality guidelines.
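For readers who want to experiment with the two modified measures named above, the sketch below shows one plausible Python implementation: an item-excluded (corrected) item-total correlation for discrimination, and Cronbach's alpha projected to a common reference length via the Spearman-Brown prophecy formula for length-normalized reliability. The function names, the 0/1 response-matrix layout, the latent-ability toy data, and the reference length of 30 items are illustrative assumptions for this sketch, not the paper's exact definitions.

```python
import numpy as np

def item_excluded_discrimination(scores: np.ndarray) -> np.ndarray:
    """Item-excluded (corrected) item-total correlation per item.

    `scores` is a (students x items) matrix of 0/1 responses. Each
    item is correlated against the total score computed WITHOUT that
    item, removing the part-whole inflation of the naive item-total
    correlation.
    """
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha internal-consistency reliability."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def length_normalized_alpha(alpha: float, k: int, k_ref: int = 30) -> float:
    """Project alpha to a common reference length via the
    Spearman-Brown prophecy formula, so tests of different lengths
    can be compared on an equal footing (k_ref = 30 is an arbitrary
    choice for this sketch).
    """
    m = k_ref / k
    return m * alpha / (1 + (m - 1) * alpha)

# Toy demonstration: 200 students answering 20 items, with responses
# driven by a latent ability so items are positively correlated.
rng = np.random.default_rng(1)
ability = rng.normal(size=(200, 1))
difficulty = rng.normal(size=(1, 20))
scores = (ability - difficulty + rng.normal(size=(200, 20)) > 0).astype(int)

disc = item_excluded_discrimination(scores)
alpha = cronbach_alpha(scores)
print(f"discrimination range: {disc.min():.2f} to {disc.max():.2f}")
print(f"alpha = {alpha:.2f}, "
      f"length-normalized = {length_normalized_alpha(alpha, 20):.2f}")
```

The item-exclusion step matters for short classroom tests: with few items, an item's own contribution to the total score inflates the naive item-total correlation, and the corrected version avoids that bias.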
Slepkov, A. D., Van Bussel, M. L., Fitze, K. M., & Burr, W. S. (2021). A Baseline for Multiple-Choice Testing in the University Classroom. SAGE Open, 11(2). https://doi.org/10.1177/21582440211016838