This study examined the issues affecting the use of IRT models for investigating differential item functioning in high-stakes testing, focusing specifically on the Special English Subtest of the Iranian National University Entrance Exam (INUEE). A sample of 200,000 participants was randomly selected from the candidates who took the INUEE in 2003 and 2004. Data collected in six domains (vocabulary, grammar, word order, language function, cloze, and reading comprehension) were analyzed to evaluate the applicability of item response theory (IRT; Embretson & Reise, 2000), including the use of IRT for assessing differential item functioning (DIF; Zumbo, 2007). Substantial model-data misfit was observed in calibrations using the PARSCALE and BILOG-MG software (Scientific Software International, 2004). Additional analysis with Xcalibre and Iteman 4 (Assessment Systems Corporation, 2010) suggested that item response theory, including IRT-based DIF analysis, is not applicable when the test administered is noticeably beyond the participants' level of ability, when the test is speeded, or when examinees are penalized for wrong answers.
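For readers unfamiliar with IRT-based DIF, the following minimal Python sketch (not drawn from the paper; all parameter values are hypothetical) illustrates the three-parameter logistic model that underlies calibrations of the kind run in PARSCALE and BILOG-MG, together with a Raju-style area index that quantifies DIF as the gap between group-specific item characteristic curves.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL item response function: probability of a correct response
    at ability theta, with discrimination a, difficulty b, guessing c."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def area_dif(params_ref, params_focal, lo=-4.0, hi=4.0, n=801):
    """Raju-style unsigned area between the reference-group and
    focal-group item characteristic curves, approximated on a theta grid."""
    theta = np.linspace(lo, hi, n)
    p_ref = p_3pl(theta, *params_ref)
    p_foc = p_3pl(theta, *params_focal)
    return np.trapz(np.abs(p_ref - p_foc), theta)

# Illustrative (hypothetical) parameters for one item calibrated
# separately in two groups; a large area suggests the item functions
# differently across groups, i.e., exhibits DIF.
ref_item = (1.2, 0.5, 0.20)    # (a, b, c) for the reference group
focal_item = (1.2, 1.1, 0.20)  # same item, focal-group calibration
print(f"Unsigned area DIF index: {area_dif(ref_item, focal_item):.3f}")
```

Note that this kind of comparison presupposes an adequately fitting IRT model in each group; as the abstract reports, severe model-data misfit (e.g., from a too-difficult or speeded test, or penalty scoring) undermines the interpretation of such indices.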
Citation:
Ahmadi, A., & Thompson, N. A. (2012). Issues affecting item response theory fit in language assessment: A study of differential item functioning in the Iranian National University Entrance Exam. Journal of Language Teaching and Research, 3(3), 401-412. https://doi.org/10.4304/jltr.3.3.401-412