Data smashing: Uncovering lurking order in data

Ishanu Chattopadhyay; Hod Lipson

Journal Article, Open Access

Journal of the Royal Society Interface (2014) 11(101)

DOI: 10.1098/rsif.2014.0826

Abstract

From automatic speech recognition to discovering unusual stars, underlying almost all automated discovery tasks is the ability to compare and contrast data streams with each other, to identify connections and spot outliers. Despite the prevalence of data, however, automated methods are not keeping pace. A key bottleneck is that most data comparison algorithms today rely on a human expert to specify what 'features' of the data are relevant for comparison. Here, we propose a new principle for estimating the similarity between the sources of arbitrary data streams, using neither domain knowledge nor learning. We demonstrate the application of this principle to the analysis of data from a number of real-world challenging problems, including the disambiguation of electro-encephalograph patterns pertaining to epileptic seizures, detection of anomalous cardiac activity from heart sound recordings and classification of astronomical objects from raw photometry. In all these cases and without access to any domain knowledge, we demonstrate performance on a par with the accuracy achieved by specialized algorithms and heuristics devised by domain experts. We suggest that data smashing principles may open the door to understanding increasingly complex observations, especially when experts do not know what to look for.

Citation (APA)

Chattopadhyay, I., & Lipson, H. (2014). Data smashing: Uncovering lurking order in data. Journal of the Royal Society Interface, 11(101). https://doi.org/10.1098/rsif.2014.0826
