Scalable Machine Learning on High-Dimensional Vectors: From Data Series to Deep Network Embeddings

Karima Echihabi; Kostas Zoumpatianos; Themis Palpanas

Conference Proceedings, Open Access

ACM International Conference Proceeding Series (2020) Part F162565 1-6

DOI: 10.1145/3405962.3405989

6 Citations

13 Readers

Abstract

Applications in diverse domains increasingly need techniques that can analyze very large collections of static and streaming sequences (a.k.a. data series), predominantly in real time. Examples of such applications come from Internet of Things installations, neuroscience, astrophysics, and a multitude of other scientific and application domains that need to apply machine learning techniques for knowledge extraction. It is not unusual for these applications, for which similarity search is a core operation, to involve hundreds of millions to billions of data series, which are seldom analyzed in full detail because of their sheer size. Such application requirements have driven the development of novel similarity search methods that can facilitate scalable analytics in this context. At the same time, a host of other methods have been developed for similarity search of high-dimensional vectors in general. All these methods are now becoming increasingly important because of the growing popularity and size of sequence collections, as well as the growing use of high-dimensional vector representations of a large variety of objects (such as text, multimedia, images, audio and video recordings, graphs, database tables, and others) thanks to deep network embeddings. In this work, we review recent efforts in designing techniques for indexing and analyzing massive collections of data series, and argue that they are the methods of choice even for general high-dimensional vectors. Finally, we discuss the challenges and open research problems in this area.

Citation (APA)

Echihabi, K., Zoumpatianos, K., & Palpanas, T. (2020). Scalable Machine Learning on High-Dimensional Vectors: From Data Series to Deep Network Embeddings. In ACM International Conference Proceeding Series (Vol. Part F162565, pp. 1–6). Association for Computing Machinery. https://doi.org/10.1145/3405962.3405989
