Exploring metrics for the analysis of code submissions in an introductory data science course

Citations: 6 | Mendeley readers: 22

Abstract

While data science education has gained increased recognition in both academic institutions and industry, there has been little research on automated coding assessment for novice students. Our work presents a first step in this direction by leveraging coding metrics from traditional software engineering (Halstead Volume and Cyclomatic Complexity) in combination with metrics that reflect a data science project's learning objectives (the number of library calls and the number of library calls in common with the solution code). Using these metrics, we examined the code submissions of 97 students across two semesters of an introductory data science course. Our results indicated that the metrics can identify cases where students wrote overly complicated code and would benefit from scaffolding feedback. The number of library calls, in particular, was also a significant predictor of changes in submission score and submission runtime, which highlights the distinctive nature of data science programming. We conclude with suggestions for extending our analyses toward more actionable intervention strategies, for example by tracking fine-grained submission grading outputs throughout a student's submission history, to better model and support students in their data science learning process.
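
For concreteness, the sketch below shows one way the two data-science-oriented metrics (number of library calls, and library calls shared with the solution code) could be approximated for Python submissions using only the standard-library ast module. The function names and the toy student/solution strings are illustrative assumptions, not the authors' actual grading pipeline; Halstead Volume and Cyclomatic Complexity would typically be obtained from an off-the-shelf static-analysis tool rather than computed by hand.

import ast

def imported_names(code):
    # Collect module names and aliases introduced by import statements.
    names = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                names.add(alias.asname or alias.name)
    return names

def dotted_name(node):
    # Rebuild a dotted call target such as "pd.read_csv" from the AST.
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def library_calls(code):
    # Calls whose leading name resolves to an imported library or symbol.
    libs = imported_names(code)
    calls = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name and name.split(".")[0] in libs:
                calls.append(name)
    return calls

# Toy comparison of a student submission against a reference solution.
student = "import pandas as pd\ndf = pd.read_csv('x.csv')\ndf = df.dropna()\n"
solution = "import pandas as pd\ndf = pd.read_csv('x.csv').dropna()\n"
student_calls = library_calls(student)
common = set(student_calls) & set(library_calls(solution))
print(len(student_calls), len(common))  # library calls, and calls shared with the solution

Counting calls at the level of dotted names (rather than raw tokens) keeps the metric robust to whitespace and variable naming, which matters when comparing many novice submissions against a single reference solution.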

Citation (APA)

Nguyen, H., Lim, M., Moore, S., Nyberg, E., Sakr, M., & Stamper, J. (2021). Exploring metrics for the analysis of code submissions in an introductory data science course. In ACM International Conference Proceeding Series (pp. 632–638). Association for Computing Machinery. https://doi.org/10.1145/3448139.3448209
