Student retention is an important issue for university policy makers because attrition can harm both the image of the university and the career paths of the students who drop out. Although this issue has been thoroughly studied by many institutional researchers using parametric techniques, such as regression analysis and logit modeling, this article attempts to bring in a new perspective by exploring the issue with three data mining techniques: classification trees, multivariate adaptive regression splines (MARS), and neural networks. The data mining procedures identify transferred hours, residency, and ethnicity as crucial predictors of retention. Carrying transferred hours into the university implies that the students have taken college-level classes elsewhere, suggesting that they are more academically prepared for university study than those who have no transferred hours. Although residency was found to be a crucial predictor of retention, this finding should not be interpreted to mean that retention is affected by proximity to the university. Instead, it is a typical example of Simpson's Paradox: geographical information system analysis indicates that non-residents from the east coast tend to be more persistent in enrollment than their west coast schoolmates.
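As a rough illustration of the Simpson's Paradox mentioned above, the following minimal sketch shows how an aggregate comparison between two groups can reverse once the data are split by a third variable. The counts, the choice of transferred hours as the stratifying variable, and all function names are hypothetical and chosen only for illustration; they are not data or methods from the study.

```python
from fractions import Fraction

# Hypothetical (retained, enrolled) counts per group and stratum.
# These numbers are invented for illustration and are NOT from the study.
counts = {
    "resident": {
        "transfer_hours": (81, 87),
        "no_transfer_hours": (192, 263),
    },
    "non_resident": {
        "transfer_hours": (234, 270),
        "no_transfer_hours": (55, 80),
    },
}


def stratum_rates(group):
    """Retention rate within each stratum of one group, as exact fractions."""
    return {s: Fraction(r, n) for s, (r, n) in counts[group].items()}


def overall_rate(group):
    """Retention rate of one group after pooling all of its strata."""
    retained = sum(r for r, _ in counts[group].values())
    enrolled = sum(n for _, n in counts[group].values())
    return Fraction(retained, enrolled)


def is_reversal(group_a, group_b):
    """True if group_a leads in every stratum but trails group_b overall."""
    rates_a, rates_b = stratum_rates(group_a), stratum_rates(group_b)
    leads_everywhere = all(rates_a[s] > rates_b[s] for s in rates_a)
    return leads_everywhere and overall_rate(group_a) < overall_rate(group_b)


if __name__ == "__main__":
    for group in counts:
        per_stratum = {s: float(v) for s, v in stratum_rates(group).items()}
        print(group, per_stratum, float(overall_rate(group)))
    print("reversal:", is_reversal("resident", "non_resident"))
```

In these invented counts, residents have the higher retention rate within each stratum, yet non-residents have the higher pooled rate because the two groups are distributed very differently across strata. This is why a single aggregate predictor such as residency should be disaggregated before it is interpreted.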
CITATION STYLE
Yu, C. H., DiGangi, S., Jannasch-Pennell, A., & Kaprolet, C. (2021). A Data Mining Approach for Identifying Predictors of Student Retention from Sophomore to Junior Year. Journal of Data Science, 8(2), 307–325. https://doi.org/10.6339/jds.2010.08(2).574