This study presents a data science methodology to integrate and explore disparate student data from an engineering-mathematics course. Our methodology uses exploratory data mining and visualization to analyze raw student data from multiple sources. The exploratory analysis serves two purposes: 1) it supports the instructor's desire to gain insights into the implementation of a flipped classroom, and 2) it serves as a case study for a proposed data science pipeline for educational data. As part of the flipped class, the instructor had students complete assignments in an online homework system before each class meeting and take readiness assessment tests (RATs) at the beginning of each class. The RAT scores were recorded in Excel and combined with student performance data exported from the online homework system. Paper exams were administered at the end of each unit, and the exam scores were combined with RAT scores, lesson assignment scores, and demographic data. A combination of data mining and classical statistical techniques was used to reveal trends and peculiarities in the data, without a specific question or topic to investigate. The data science pipeline that we present has four major stages: data preprocessing, exploratory factor analysis, visualization, and feature engineering. Our study revealed some trends and clusters within and across course units. The analysis results show the differences and similarities within the course units and help track learner behavior. A few differences related to gender were found, but prior experience in a course taught using the flipped classroom model did not show a significant difference. Exploratory factor analysis identified two factors from the whole data set: class activities and exams (factor 1) and homework and lesson assignments (factor 2). The discovered factors were found to cluster in two groups within the course units: Units 1 to 7 and Units 8 to 13, with the dividing point at the withdrawal date. Results also showed that female students had more class activity scores (i.e., they attended and participated in more classes) than male students. Future work will include collecting more data and generating hypotheses that can be tested using the collected data.
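The data integration step described above (combining RAT scores, homework-system exports, and exam scores per student and unit) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the column names (`student_id`, `unit`, `rat_score`, `lesson_score`, `exam_score`), the sample values, and the helper functions `read_source` and `merge_sources` are all hypothetical.

```python
import csv
import io

# Hypothetical exports. Column names and values are illustrative only,
# not taken from the study's data.
RAT_CSV = """student_id,unit,rat_score
s01,1,8
s02,1,6
"""
HOMEWORK_CSV = """student_id,unit,lesson_score
s01,1,0.90
s03,1,0.75
"""
EXAM_CSV = """student_id,unit,exam_score
s01,1,85
s02,1,72
"""


def read_source(text, value_column):
    """Map (student_id, unit) -> float score for one data source."""
    rows = csv.DictReader(io.StringIO(text))
    return {(r["student_id"], int(r["unit"])): float(r[value_column]) for r in rows}


def merge_sources(sources):
    """Outer-join all sources on (student_id, unit); missing scores become None."""
    keys = sorted(set().union(*(table.keys() for table in sources.values())))
    return [
        {
            "student_id": sid,
            "unit": unit,
            **{name: table.get((sid, unit)) for name, table in sources.items()},
        }
        for sid, unit in keys
    ]


sources = {
    "rat_score": read_source(RAT_CSV, "rat_score"),
    "lesson_score": read_source(HOMEWORK_CSV, "lesson_score"),
    "exam_score": read_source(EXAM_CSV, "exam_score"),
}
merged = merge_sources(sources)

for row in merged:
    print(row)
```

An outer join is used here so that a student missing from one source (for example, one who skipped a RAT) still appears in the merged table with an explicit gap, which keeps attendance-related patterns visible to later analysis stages.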
Sener, A. C. A., Hieb, J. L., & Nasraoui, O. (2019). Using a data science pipeline for course data: A case study analyzing heterogeneous student data in two flipped classes. In ASEE Annual Conference and Exposition, Conference Proceedings. American Society for Engineering Education. https://doi.org/10.18260/1-2--33492