Exploring metrics for the analysis of code submissions in an introductory data science course

Citations: 6 | Mendeley readers: 22

Abstract

While data science education has gained increased recognition in both academic institutions and industry, there has been little research on automated coding assessment for novice students. Our work presents a first step in this direction by leveraging coding metrics from traditional software engineering (Halstead Volume and Cyclomatic Complexity) in combination with metrics that reflect a data science project's learning objectives (the number of library calls and the number of library calls in common with the solution code). Using these metrics, we examined the code submissions of 97 students across two semesters of an introductory data science course. Our results indicated that the metrics can identify cases where students wrote overly complicated code and would benefit from scaffolding feedback. The number of library calls, in particular, was also a significant predictor of changes in submission score and submission runtime, which highlights the distinctive nature of data science programming. We conclude with suggestions for extending our analyses toward more actionable intervention strategies, for example by tracking fine-grained submission grading outputs throughout a student's submission history, to better model and support students in their data science learning process.
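
For concreteness, the sketch below shows one way the two data-science-oriented metrics (number of library calls, and library calls shared with the solution code) could be approximated for Python submissions using only the standard-library ast module. The function names and the toy student/solution strings are illustrative assumptions, not the authors' actual grading pipeline; Halstead Volume and Cyclomatic Complexity would typically be obtained from an off-the-shelf static-analysis tool rather than computed by hand.

import ast

def imported_names(code):
    # Collect module names and aliases introduced by import statements.
    names = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                names.add(alias.asname or alias.name)
    return names

def dotted_name(node):
    # Rebuild a dotted call target such as "pd.read_csv" from the AST.
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def library_calls(code):
    # Calls whose leading name resolves to an imported library or symbol.
    libs = imported_names(code)
    calls = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name and name.split(".")[0] in libs:
                calls.append(name)
    return calls

# Toy comparison of a student submission against a reference solution.
student = "import pandas as pd\ndf = pd.read_csv('x.csv')\ndf = df.dropna()\n"
solution = "import pandas as pd\ndf = pd.read_csv('x.csv').dropna()\n"
student_calls = library_calls(student)
common = set(student_calls) & set(library_calls(solution))
print(len(student_calls), len(common))  # library calls, and calls shared with the solution

Counting calls at the level of dotted names (rather than raw tokens) keeps the metric robust to whitespace and variable naming, which matters when comparing many novice submissions against a single reference solution.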

Citation (APA)

Nguyen, H., Lim, M., Moore, S., Nyberg, E., Sakr, M., & Stamper, J. (2021). Exploring metrics for the analysis of code submissions in an introductory data science course. In ACM International Conference Proceeding Series (pp. 632–638). Association for Computing Machinery. https://doi.org/10.1145/3448139.3448209
