Comparing overlapping data distributions using visualization

Eric Newburger; Niklas Elmqvist

Journal Article

Information Visualization (2023) 22(4) 291-306

DOI: 10.1177/14738716231173731

Abstract

We present results from a preregistered, crowdsourced user study in which we asked members of the general population to determine whether two samples, represented using different forms of data visualization, are drawn from the same population or from different populations. Such a task reduces to assessing whether the overlap between the two visualized samples is large enough to suggest a common origin. When using idealized normal curves fitted on the samples, it is essentially a graphical formulation of the classic Student’s t-test. We speculate that more sophisticated visual representations, such as bar histograms, Wilkinson dot plots, strip plots, or Tukey boxplots, will allow people both to be more accurate at this task and to better understand its meaning. In other words, the purpose of our study is to explore which visualization best scaffolds novices in making graphical inferences about data. However, our results indicate that the more abstracted, idealized bell curve representation of the task yields higher accuracy.

Cite

Newburger, E., & Elmqvist, N. (2023). Comparing overlapping data distributions using visualization. Information Visualization, 22(4), 291–306. https://doi.org/10.1177/14738716231173731
