cuDTW++: Ultra-fast dynamic time warping on CUDA-enabled GPUs

Abstract

Dynamic Time Warping (DTW) is a widely used distance measure in time series data mining. However, computing DTW scores is compute-intensive because the complexity is quadratic in the lengths of the time series. This makes important data mining tasks computationally expensive even for moderate query lengths and database sizes. Previous approaches to accelerating DTW on GPUs cannot fully exploit the available compute performance because of inefficient memory access schemes. In this paper, we introduce a novel parallelization strategy that drastically speeds up DTW on CUDA-enabled GPUs by using low-latency warp intrinsics for fast inter-thread communication. We show that our CUDA parallelization (cuDTW++) achieves over 90% of the theoretical peak performance of modern Volta-based GPUs, thereby outperforming the previously fastest CUDA implementation (cudaDTW) by more than an order of magnitude. Furthermore, cuDTW++ achieves a speedup of two to three orders of magnitude over the state-of-the-art CPU program UCR-Suite for subsequence search of ECG signals.
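
To make the quadratic cost mentioned in the abstract concrete, the following is a minimal sequential sketch of the textbook DTW recurrence in Python. It is not the cuDTW++ implementation and does not reproduce the paper's warp-intrinsic scheme. The function name `dtw_distance`, the squared-difference local cost, and the absence of a warping-window constraint are assumptions made for illustration only.

```python
import math


def dtw_distance(query, subject):
    """Unconstrained DTW score between two numeric sequences.

    Assumption: the local cost is the squared difference of two samples.
    The abstract does not state which local cost the paper uses.
    """
    n, m = len(query), len(subject)
    # prev[j] is the accumulated cost of aligning the first i-1 query
    # samples with the first j subject samples. Only two rows are kept,
    # so memory is O(m) while time stays O(n * m).
    prev = [0.0] + [math.inf] * m
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (query[i - 1] - subject[j - 1]) ** 2
            # Each cell depends on its upper, left, and upper-left neighbours.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The nested loops visit every one of the n × m cells once, which is where the quadratic complexity comes from. Because each cell depends only on its upper, left, and upper-left neighbours, cells on the same anti-diagonal are independent of one another; this is the general property that parallel DTW implementations can exploit.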

Cite

Schmidt, B., & Hundt, C. (2020). cuDTW++: Ultra-fast dynamic time warping on CUDA-enabled GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12247 LNCS, pp. 597–612). Springer. https://doi.org/10.1007/978-3-030-57675-2_37
