Methods from item response theory: Going beyond traditional validity and reliability in standardizing assessments

Abstract

In determining the effectiveness of educational interventions, the Gold Standard requires the use of tests and assessments of proven validity. Messick (1989) defined validity as an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores (p. 13). Education researchers wishing to evaluate the effectiveness of educational interventions and programs under the Gold Standard must either develop and validate their own tests and assessments or use ones developed and validated by others. As a result of the No Child Left Behind federal legislative mandate for Grades K-12 in the United States (NCLB, 2002), educational research on intervention programs that improve student learning in mathematics, reading, and science in Grades K-12 has one natural test of interest: the standardized examination used in the state for determining student proficiency status and school and district proficiency rates. Local school personnel and state education professionals are particularly interested in research showing improvements in student performance on these high-stakes tests. Other standardized assessments that can be used to show the effectiveness of an educational program or intervention are the National Assessment of Educational Progress (NAEP, US National Center for Education Statistics, n.d.), the ACT® (ACT, n.d.), and the SAT (College Board, n.d.). However, the use of state NCLB tests and these other assessments is precluded in many situations. For example, the educational program or intervention may be targeted at a subject area not covered by these assessments, such as history or study in a foreign language. Even if the subject area is mathematics, reading, or science, the goal of the intervention may not align with the underlying curriculum and goals of the NCLB tests in that subject area. For example, programs focusing on the development of problem-solving skills in mathematics may have different goals than the curriculum tested on the NAEP or the NCLB state assessment. These assessments would not be good measures of the effectiveness of this type of intervention program. © 2009 Springer Netherlands.

Cite

Froelich, A. G. (2009). Methods from item response theory: Going beyond traditional validity and reliability in standardizing assessments. In Quality Research in Literacy and Science Education: International Perspectives and Gold Standards (pp. 287–301). Springer Netherlands. https://doi.org/10.1007/978-1-4020-8427-0_14
