In this paper, we detail the successful deployment of a machine learning autograder that significantly decreases the grading labor required in the Breakout computer science assignment. This assignment, which tasks students with programming a game consisting of a controllable paddle and a ball that bounces off the paddle to break bricks, is popular for engaging students with introductory computer science concepts, but creates a large grading burden. Due to the game's interactive nature, grading defies traditional unit tests and instead typically requires 8+ minutes of manually playing each student's game to search for bugs. This amounts to 45+ hours of grading in a standard course offering and prevents wider adoption of the assignment. Our autograder alleviates this burden by playing each student's game with a reinforcement learning agent and providing videos of discovered bugs to instructors. In an A/B test against manual grading, we find that our human-in-the-loop AI autograder reduces grading time by 44% while slightly improving grading accuracy by 6%, ultimately saving roughly 30 hours over our deployment in two offerings of the assignment. Our results further suggest the practicality of grading other interactive assignments (e.g., other games or building websites) via similar machine learning techniques. Live demo at https://ezliu.github.io/breakoutgrader.
Liu, E. Z., Yuan, D., Ahmed, A., Cornwall, E., Woodrow, J., Burns, K., … Finn, C. (2024). A Fast and Accurate Machine Learning Autograder for the Breakout Assignment. In SIGCSE 2024 - Proceedings of the 55th ACM Technical Symposium on Computer Science Education (Vol. 1, pp. 736–742). Association for Computing Machinery, Inc. https://doi.org/10.1145/3626252.3630759