Examining the fairness of language test across gender with IRT-based differential item and test functioning methods


Abstract

Test fairness is an important indicator of the validity of test results. Fairness and equity require ensuring that the background characteristics of test-takers, such as ethnicity and gender, do not affect their test scores. Differential item functioning (DIF) methods are commonly used to detect potentially biased items, which lead to unfair assessment of test-takers who have the same ability level but come from different cultural, social, demographic, and linguistic backgrounds. This study aims to detect potentially biased items across gender and to examine their effect on test scores, in order to ensure the fairness of test results for each domain and for the entire test. The item response theory (IRT) based Lord’s chi-square DIF method at the item level and the Mantel-Haenszel/Liu-Agresti differential test functioning (DTF) method at the test level were applied to the English Placement Test (EPT) administered to high school graduates by the National Center for Assessment. The results show that six items of the EPT exhibit DIF across the entire test. Two of them belong to the reading comprehension domain and four to the structure domain, while none of the compositional analysis items shows DIF. These results indicate the existence of a content-specific DIF effect. Additionally, two items exhibit uniform DIF, one favoring male students and the other favoring female students. The small to moderate DTF effects associated with the sub-domains and the entire test imply that DIF effects cancel each other out, assuring the fairness of results at the test level. However, the items with substantially high DIF values need to be examined by content experts to determine the possible causes of the DIF effects and to avoid gender bias and unfair test outcomes. We also suggest conducting further studies to investigate the reasons behind content-specific DIF effects in language tests.
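
As a rough illustration of the idea behind Mantel-Haenszel-type DIF statistics, the sketch below computes the Mantel-Haenszel common odds ratio for a single dichotomous item. It is not the authors' implementation, and it covers neither Lord’s chi-square nor the Liu-Agresti DTF estimator. It assumes 0/1 item scores, two groups labeled "ref" and "focal", and a matching variable such as the total test score; all function and variable names are hypothetical.

```python
import math
from collections import defaultdict


def mh_common_odds_ratio(responses, groups, strata):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    responses: iterable of 0/1 item scores
    groups:    iterable of "ref" or "focal" labels (assumed labels)
    strata:    iterable of matching scores (e.g. total test score)
    """
    # One 2x2 table per stratum: [ref correct, ref wrong, focal correct, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for score, group, stratum in zip(responses, groups, strata):
        column = 0 if score == 1 else 1
        row = 0 if group == "ref" else 2
        tables[stratum][row + column] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no informative strata")
    return numerator / denominator


def mh_delta(odds_ratio):
    """Convert the odds ratio to the ETS delta scale (MH D-DIF)."""
    return -2.35 * math.log(odds_ratio)


# Toy data: within each stratum both groups have the same odds of success,
# so the item shows no DIF under this statistic.
responses = [1, 1, 0, 0, 1, 0] + [1, 1, 1, 0, 1, 1, 1, 0]
groups = ["ref"] * 4 + ["focal"] * 2 + ["ref"] * 4 + ["focal"] * 4
strata = [1] * 6 + [2] * 8

alpha = mh_common_odds_ratio(responses, groups, strata)
print(alpha, mh_delta(alpha))
```

An odds ratio near 1 (delta near 0) suggests that matched test-takers in both groups have similar odds of answering the item correctly; values far from 1 flag the item for review by content experts.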

Cite

Ozdemir, B., & Alshamrani, A. H. (2020). Examining the fairness of language test across gender with IRT-based differential item and test functioning methods. International Journal of Learning, Teaching and Educational Research, 19(6), 27–45. https://doi.org/10.26803/ijlter.19.6.2
