Comparing overlapping data distributions using visualization

Abstract

We present results from a preregistered, crowdsourced user study in which we asked members of the general population to determine whether two samples, represented using different forms of data visualization, are drawn from the same or different populations. Such a task reduces to assessing whether the two visualized samples overlap enough to suggest a common origin. When using idealized normal curves fitted to the samples, it is essentially a graphical formulation of the classic Student’s t-test. We speculate, however, that more sophisticated visual representations, such as bar histograms, Wilkinson dot plots, strip plots, or Tukey boxplots, will both allow people to be more accurate at this task and help them better understand its meaning. In other words, the purpose of our study is to explore which visualization best scaffolds novices in making graphical inferences about data. Contrary to this expectation, our results indicate that the more abstract, idealized bell curve representation of the task yields higher accuracy.
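
For readers unfamiliar with the test the abstract refers to, the following is a minimal sketch of the pooled-variance (Student's) two-sample t statistic, which puts a number on the same "how much do the two samples overlap" judgment that participants made visually. It is an illustration only, not the authors' analysis code: the function name `student_t` and the two example samples are hypothetical, and the sketch assumes equal population variances.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Return (t, df) for the pooled-variance two-sample Student's t-test.

    Assumes both samples come from populations with equal variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Made-up samples, for illustration only.
sample_a = [1, 2, 3, 4, 5]
sample_b = [2, 3, 4, 5, 6]
t_stat, dof = student_t(sample_a, sample_b)
print(f"t = {t_stat:.3f}, df = {dof}")
```

A |t| close to zero corresponds to heavily overlapping fitted curves, while a large |t| corresponds to curves that are well separated relative to their spread.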

Newburger, E., & Elmqvist, N. (2023). Comparing overlapping data distributions using visualization. Information Visualization, 22(4), 291–306. https://doi.org/10.1177/14738716231173731
