Human Learning in Data Science

0Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.
Get full text

Abstract

As machine learning becomes a more and more important area in Data Science, bringing with it a rise of abstractness and complexity, the desire for explainability rises, too. With our work we aim to gain explainability focussing on correlation clustering and try to pursue the original goals of different Data Science tasks,: Extracting knowledge from data. As well-known tools like Fold-It or GeoTime show, gamification is a very mighty approach, but not only to solve tasks which prove more difficult for machines than for humans. We could also gain knowledge from how players proceed trying to solve those difficult tasks. That is why we developed Straighten it up!, a game in which users try to find the best linear correlations in high dimensional datasets. Finding arbitrarily oriented subspaces in high dimensional data is an exponentially complex task due to the number of potential subspaces in regards to the number of dimensions. Nevertheless, linearly correlated points are as a simple pattern easy to track by the human eye. Straighten it up! gives users an overview over two-dimensional projections of a self-chosen dataset. Users decide which subspace they want to examine first, and can draw in arbitrarily many lines fitting the data. An offset inside of which points are assigned to the corresponding line can easily be chosen for every line independently, and users can switch between different projections at any time. We developed a scoring system not only as incentive, but first of all for further examination, based on the density of each cluster, its minimum spanning tree, size of offset, and coverage. By tracking every step of a user we are able to detect common mechanisms and examine differences to state-of-the-art correlation and subspace clustering algorithms, resulting in more comprehensibility.

Cite

CITATION STYLE

APA

Beer, A., Kazempour, D., Baur, M., & Seidl, T. (2019). Human Learning in Data Science. In Communications in Computer and Information Science (Vol. 1033, pp. 170–176). Springer Verlag. https://doi.org/10.1007/978-3-030-23528-4_24

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free