This study was designed to determine how well existing analytic rating scales functioned in the assessment of low- to mid-proficiency Japanese university students’ interactive English speaking ability during small group discussions. Many-facet Rasch measurement (MFRM) was employed to evaluate the quality of adapted rating scales for complexity, accuracy, and fluency (CAF), interaction, and communicative effectiveness. The video-recorded performances of 64 participants who completed 10-min group discussion tasks at the beginning and end of their first semester of university study were independently rated by four experienced raters using 9-point rating scales, and the resulting scores were subjected to MFRM analysis. Although the scores demonstrated acceptable fit to the Rasch model, closer inspection of the data using Linacre’s (J Appl Meas 3:85–106, 2002a) guidelines for post hoc evaluation of rating scale category quality revealed multiple problems with the 9-point scales and suggested that four major revisions were likely to improve the scales for use in this context. The five 5-point rating scales developed through these revisions were then used by the same raters to reassess the same task performances. The 5-point rating scale data were subjected to the same MFRM analyses and were found to demonstrate notably improved functioning and quality.
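As a rough illustration of the kind of model underlying an MFRM analysis, the following is a minimal sketch of category probabilities under the rating scale form of the many-facet Rasch model, in which the log-odds of receiving category k rather than k-1 equal ability minus rater severity minus criterion difficulty minus the k-th threshold. The choice of facets, the function name, and every numeric value below are assumptions made for illustration; they are not estimates or specifications from the study.

```python
import math


def category_probabilities(ability, severity, difficulty, thresholds):
    """Return the probability of each category (0..len(thresholds)).

    Assumes a many-facet rating scale model where
    log(P_k / P_(k-1)) = ability - severity - difficulty - thresholds[k-1].
    All arguments are in logits.
    """
    # Cumulative log-numerators; the lowest category has an empty sum (0).
    logits = [0.0]
    for threshold in thresholds:
        step = ability - severity - difficulty - threshold
        logits.append(logits[-1] + step)

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical values: a 5-point scale has four thresholds.
probs = category_probabilities(
    ability=0.5,       # examinee ability (assumed)
    severity=0.2,      # rater severity (assumed)
    difficulty=-0.1,   # rating criterion difficulty (assumed)
    thresholds=[-2.0, -0.7, 0.7, 2.0],
)
```

Widely spaced, ordered thresholds such as these give each category a region of the ability range in which it is the most probable rating, which is one of the properties that post hoc category diagnostics look for.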
McDonald, K. (2018). Post hoc evaluation of analytic rating scales for improved functioning in the assessment of interactive L2 speaking ability. Language Testing in Asia, 8(1). https://doi.org/10.1186/s40468-018-0074-3