The aim of this study is to analyse the importance of the number of raters and to compare the results obtained with techniques based on Classical Test Theory (CTT) and Generalizability (G) Theory. The Kappa and Krippendorff alpha techniques, based on CTT, were used to determine inter-rater reliability. In this descriptive study, the data consist of twenty individual investigation performance reports prepared by learners of the International Baccalaureate Diploma Programme (IBDP) and the ratings of five raters who scored these reports. The raters used an analytical rubric developed by the International Baccalaureate Organization (IBO) as the scoring tool. The CTT results show that the Kappa and Krippendorff alpha statistics failed to provide information about the sources of error causing disagreement on the criteria. The analyses based on G Theory provided comprehensive information about the sources of error and showed that increasing the number of raters would also increase the reliability of the scores. However, the raters noted that it is important to develop the descriptors for the criteria in the rubric.
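As a brief illustration of the CTT-based agreement statistics the study refers to, the sketch below computes Cohen's Kappa for two raters scoring the same items. The scores are hypothetical (they are not the study's data), and the function is a minimal stdlib-only implementation of the standard Kappa formula, (observed agreement − chance agreement) / (1 − chance agreement), for nominal rubric levels.

```python
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's Kappa for two raters over the same items (nominal categories)."""
    assert len(r1) == len(r2) and len(r1) > 0
    n = len(r1)
    # Observed agreement: proportion of items both raters scored identically.
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement: product of each rater's marginal category proportions.
    c1, c2 = Counter(r1), Counter(r2)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical rubric levels (0-3) two raters might assign to five reports.
rater_a = [2, 3, 1, 2, 0]
rater_b = [2, 3, 2, 2, 0]
print(round(cohens_kappa(rater_a, rater_b), 3))  # → 0.706
```

As the abstract notes, such a coefficient summarizes overall agreement in a single number but cannot decompose the disagreement into error sources (raters, criteria, their interactions), which is what the G Theory analysis provides.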
ARSLAN MANCAR, S., & GÜLLEROĞLU, H. D. (2022). Comparison of Inter-Rater Reliability Techniques in Performance-Based Assessment. International Journal of Assessment Tools in Education, 9(2), 515–533. https://doi.org/10.21449/ijate.993805