Early prediction of students at risk of doing poorly in CS1 can enable early interventions or class adjustments. Preferably, prediction methods would be lightweight, not requiring much extra activity or data-collection work from instructors beyond what they already do. Previous methods included giving surveys, collecting (potentially sensitive) demographic data, introducing clicker questions into lectures, or using locally developed systems that analyze programming behavior, each requiring some effort by instructors. Today, a widely used textbook and learning system in CS1 classes is zyBooks, used by several hundred thousand students annually. The system automatically collects data related to reading, homework, and programming assignments. For a 300+ student CS1 class, we found that three data metrics, auto-collected by that system in early weeks (1–4), were good at predicting performance on the week-6 midterm exam: non-earnest completion of the assigned readings, struggle on the coding homework, and low scores on the programming assignments, with correlation magnitudes of 0.44, 0.58, and 0.72, respectively. We combined those metrics in a decision tree model to predict students at risk of failing the midterm exam (<70%, meaning D or F), and achieved 85% prediction accuracy with 82% sensitivity and 89% specificity, which is higher than that of previously published early-prediction approaches. The approach may mean that thousands of instructors already using zyBooks (or a similar system) can get a more accurate early prediction of at-risk students, without requiring extra effort or activities, and avoiding collection of sensitive demographic data.
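For readers less familiar with the three reported measures, the following minimal Python sketch shows how accuracy, sensitivity, and specificity are computed from at-risk predictions. It is an illustration only, not the authors' code: the function name `confusion_metrics`, the midterm scores, and the predictions are all made up. The only detail taken from the abstract is the at-risk definition (midterm score below 70%).

```python
def confusion_metrics(actual, predicted):
    """Return (accuracy, sensitivity, specificity) for boolean at-risk labels.

    Assumes both classes (at-risk and not at-risk) occur in `actual`,
    so neither denominator below is zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # at-risk, flagged
    tn = sum(1 for a, p in pairs if not a and not p)  # not at-risk, not flagged
    fp = sum(1 for a, p in pairs if not a and p)      # not at-risk, flagged
    fn = sum(1 for a, p in pairs if a and not p)      # at-risk, missed
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # share of at-risk students correctly flagged
    specificity = tn / (tn + fp)   # share of not-at-risk students correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical data: ten made-up midterm scores and made-up model predictions.
midterm_scores = [55, 62, 68, 71, 80, 90, 45, 75, 85, 66]
actual = [score < 70 for score in midterm_scores]  # at-risk means below 70%
predicted = [True, True, False, False, False, False, True, True, False, True]

accuracy, sensitivity, specificity = confusion_metrics(actual, predicted)
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} "
      f"specificity={specificity:.2f}")
```

In this setting sensitivity is the share of students who actually scored below 70% that the model flagged, and specificity is the share of passing students it correctly left unflagged.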
Gordon, C., Zhao, S., & Vahid, F. (2023). Ultra-Lightweight Early Prediction of At-Risk Students in CS1. In SIGCSE 2023 - Proceedings of the 54th ACM Technical Symposium on Computer Science Education (Vol. 1, pp. 764–770). Association for Computing Machinery, Inc. https://doi.org/10.1145/3545945.3569764