Top-k most similar trajectories search (k-NN) is frequently used in classification algorithms and recommendation systems over spatial-temporal trajectory databases. However, k-NN trajectory search is a complex operation, and a multi-user application should be able to process multiple k-NN trajectory searches concurrently and efficiently over large-scale data. The k-NN trajectories problem has received plenty of attention; however, state-of-the-art works either do not consider in-memory parallel processing of k-NN trajectories or concurrent queries in distributed environments, or they parallelize k-NN search for simpler spatial objects (i.e., 2D points) using MapReduce but ignore the temporal dimension of spatial-temporal trajectories. In this work we propose a distributed parallel approach for k-NN trajectory search in a multi-user environment using in-memory MapReduce. We propose a space/time data partitioning based on Voronoi diagrams and time pages, named Voronoi Pages, in order to provide both spatial-temporal data organization and process decentralization. In addition, we propose a spatial-temporal index for our partitions to efficiently prune the search space and improve system throughput and scalability. We implemented our solution on top of Spark's RDD data structure, which provides a thread-safe environment for concurrent MapReduce tasks in main memory. We perform extensive experiments to demonstrate the performance and scalability of our approach.
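To make the idea of a combined space/time partition concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each trajectory sample is an `(x, y, t)` tuple, that the spatial cell is chosen as the nearest of a fixed set of pivot points (the defining property of a Voronoi cell), and that a time page is a fixed-width time window. The names `voronoi_page`, `partition`, `pivots`, and `page_size` are hypothetical and do not come from the paper.

```python
from math import hypot

def voronoi_page(point, pivots, page_size):
    """Map an (x, y, t) sample to a (cell index, time page index) key.

    Assumption: the cell is the index of the nearest pivot, and the
    time page is the fixed-width window of length page_size holding t.
    """
    x, y, t = point
    cell = min(
        range(len(pivots)),
        key=lambda i: hypot(x - pivots[i][0], y - pivots[i][1]),
    )
    page = int(t // page_size)
    return cell, page

def partition(trajectory, pivots, page_size):
    """Group the samples of one trajectory by their (cell, page) key."""
    parts = {}
    for sample in trajectory:
        key = voronoi_page(sample, pivots, page_size)
        parts.setdefault(key, []).append(sample)
    return parts

# Hypothetical data: two pivots and 60-unit time pages.
pivots = [(0.0, 0.0), (10.0, 10.0)]
trajectory = [(1, 1, 5), (2, 2, 30), (9, 9, 65), (8, 9, 130)]
parts = partition(trajectory, pivots, page_size=60)
```

Under these assumptions, a query only needs to visit the partitions whose cell and time page can overlap the query trajectory, which is the kind of pruning a partition-level index would support.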
CITATION STYLE
Peixoto, D. A., & Hung, N. Q. V. (2016). Scalable and fast top-k most similar trajectories search using MapReduce in-memory. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9877 LNCS, pp. 228–241). Springer Verlag. https://doi.org/10.1007/978-3-319-46922-5_18