Scalable Machine Learning on High-Dimensional Vectors: From Data Series to Deep Network Embeddings

Abstract

Applications in diverse domains increasingly need techniques that can analyze very large collections of static and streaming sequences (a.k.a. data series), predominantly in real time. Examples of such applications come from Internet of Things installations, neuroscience, astrophysics, and a multitude of other scientific and application domains that need to apply machine learning techniques for knowledge extraction. It is not unusual for these applications, for which similarity search is a core operation, to involve hundreds of millions to billions of data series, which are seldom analyzed in full detail because of their sheer size. Such application requirements have driven the development of novel similarity search methods that can facilitate scalable analytics in this context. At the same time, a host of other methods have been developed for similarity search of high-dimensional vectors in general. All these methods are becoming increasingly important because of the growing popularity and size of sequence collections, as well as the growing use of high-dimensional vector representations of a large variety of objects (such as text, multimedia, images, audio and video recordings, graphs, database tables, and others) thanks to deep network embeddings. In this work, we review recent efforts in designing techniques for indexing and analyzing massive collections of data series, and argue that they are the methods of choice even for general high-dimensional vectors. Finally, we discuss the challenges and open research problems in this area.

Citation (APA)

Echihabi, K., Zoumpatianos, K., & Palpanas, T. (2020). Scalable Machine Learning on High-Dimensional Vectors: From Data Series to Deep Network Embeddings. In ACM International Conference Proceeding Series (Vol. Part F162565, pp. 1–6). Association for Computing Machinery. https://doi.org/10.1145/3405962.3405989
