G is for Generalisation: Predicting Student Success from Keystrokes


Abstract

Student performance prediction aims to build models that help educators identify struggling students so they can be better supported. However, prior work in this space frequently evaluates features and models on data collected from a single semester of a single course taught at a single university. Without evaluating these methods in a broader context, it remains an open question whether performance prediction methods can generalise to new data. We test three methods for evaluating student performance models on data from introductory programming courses at two universities with a total of 3,323 students. Our results suggest that cross-validation on one semester is insufficient for gauging model performance in the real world. Instead, we suggest that, where possible, future work in student performance prediction collect data from multiple semesters and use one or more as a distinct hold-out set. Failing this, bootstrapped cross-validation should be used to improve confidence in models' performance. By recommending stronger methods for evaluating performance prediction models, we hope to bring them closer to practical use and to assist teachers in understanding struggling students in novice programming courses.
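The bootstrapped cross-validation the abstract recommends as a fallback can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's exact protocol: the dataset, model choice (logistic regression), and fold/resample counts are assumptions for demonstration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for keystroke-derived features and pass/fail labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

def bootstrapped_cv(X, y, n_bootstraps=50, n_folds=5):
    """Repeat k-fold CV on bootstrap resamples to estimate score variability.

    Note: resampling with replacement can place duplicates of one student
    in both train and test folds, so the interval may be slightly optimistic;
    it is meant to quantify variance, not replace a held-out semester.
    """
    scores = []
    for _ in range(n_bootstraps):
        idx = rng.integers(0, len(y), size=len(y))  # sample with replacement
        cv_scores = cross_val_score(
            LogisticRegression(max_iter=1000), X[idx], y[idx], cv=n_folds
        )
        scores.append(cv_scores.mean())
    return np.percentile(scores, [2.5, 97.5])  # 95% interval

low, high = bootstrapped_cv(X, y)
print(f"95% bootstrap interval for CV accuracy: [{low:.3f}, {high:.3f}]")
```

Reporting the interval, rather than a single cross-validated score from one semester, makes it visible how much a model's apparent performance depends on the particular sample of students it was evaluated on.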

Citation (APA)
Pullar-Strecker, Z., Pereira, F. D., Denny, P., Luxton-Reilly, A., & Leinonen, J. (2023). G is for Generalisation: Predicting Student Success from Keystrokes. In SIGCSE 2023 - Proceedings of the 54th ACM Technical Symposium on Computer Science Education (Vol. 1, pp. 1028–1034). Association for Computing Machinery, Inc. https://doi.org/10.1145/3545945.3569824
