Exploring metrics for the analysis of code submissions in an introductory data science course

Abstract

While data science education has gained increased recognition in both academic institutions and industry, there has been little research on automated coding assessment for novice students. Our work presents a first step in this direction by leveraging coding metrics from traditional software engineering (Halstead Volume and Cyclomatic Complexity) in combination with metrics that reflect a data science project's learning objectives (the number of library calls and the number of library calls in common with the solution code). Using these metrics, we examined the code submissions of 97 students across two semesters of an introductory data science course. Our results indicated that the metrics can identify cases where students wrote overly complicated code and would benefit from scaffolding feedback. The number of library calls, in particular, was also a significant predictor of changes in submission score and submission runtime, which highlights the distinctive nature of data science programming. We conclude with suggestions for extending our analyses towards more actionable intervention strategies, for example by tracking fine-grained grading outputs across a student's submission history, to better model and support students in their data science learning process.
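The paper's metric implementations are not reproduced here, but the two library-call metrics can be approximated from a submission's abstract syntax tree. The sketch below is a minimal illustration in Python, assuming submissions are plain .py sources; the function library_calls, the set of tracked libraries, and the file names are hypothetical choices for illustration, not the authors' code.

```python
import ast

# Tracked libraries are an assumption for illustration, not from the paper.
DATA_SCIENCE_LIBS = {"numpy", "pandas", "scipy", "sklearn", "matplotlib"}

def _root_name(func):
    """Walk an attribute chain such as np.linalg.norm back to its base name."""
    while isinstance(func, ast.Attribute):
        func = func.value
    return func.id if isinstance(func, ast.Name) else None

def library_calls(source, libraries=DATA_SCIENCE_LIBS):
    """Return the library-call names (e.g. 'np.mean') appearing in source."""
    tree = ast.parse(source)
    aliases = {}  # local name -> library, e.g. {"np": "numpy", "mean": "numpy"}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in libraries:
                    aliases[alias.asname or root] = root
        elif isinstance(node, ast.ImportFrom) and node.module:
            root = node.module.split(".")[0]
            if root in libraries:
                for alias in node.names:
                    aliases[alias.asname or alias.name] = root
    return [ast.unparse(node.func)  # ast.unparse requires Python 3.9+
            for node in ast.walk(tree)
            if isinstance(node, ast.Call) and _root_name(node.func) in aliases]

# Halstead Volume and Cyclomatic Complexity can be computed with the
# third-party radon package (our assumption about tooling, not the paper's):
#   from radon.metrics import h_visit      # h_visit(src).total.volume
#   from radon.complexity import cc_visit  # sum(b.complexity for b in cc_visit(src))

if __name__ == "__main__":
    # File names are hypothetical placeholders.
    student = library_calls(open("submission.py").read())
    solution = library_calls(open("solution.py").read())
    print("number of library calls:", len(student))
    print("common library calls with solution:", len(set(student) & set(solution)))
```

Under this sketch, a submission that reimplements library functionality by hand would score low on both call counts while its Halstead Volume and Cyclomatic Complexity rise, which is the pattern the abstract flags as a candidate for scaffolding feedback.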

Citation (APA)

Nguyen, H., Lim, M., Moore, S., Nyberg, E., Sakr, M., & Stamper, J. (2021). Exploring metrics for the analysis of code submissions in an introductory data science course. In ACM International Conference Proceeding Series (pp. 632–638). Association for Computing Machinery. https://doi.org/10.1145/3448139.3448209
