Computing longest duration flocks in trajectory data
Proceedings of the 14th Annual ACM International Symposium on Advances in Geographic Information Systems, GIS '06 (2006)
- ISBN: 1595935290
- DOI: 10.1145/1183471.1183479
Available from portal.acm.org
Abstract
Moving point object data can be analyzed through the discovery of patterns. We consider the computational efficiency of computing two of the most basic spatio-temporal patterns in trajectories, namely flocks and meetings. The patterns are large enough subgroups of the moving point objects that exhibit similar movement and proximity for a certain amount of time. We consider the problem of computing a longest duration flock or meeting. We give several exact and approximation algorithms, and also show that some variants are as hard as MaxClique to compute and approximate.
Joachim Gudmundsson*
National ICT Australia Ltd, Sydney, Australia
joachim.gudmundsson@nicta.com.au

Marc van Kreveld†
Institute for Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands
marc@cs.uu.nl

Categories and Subject Descriptors: F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems

General Terms: Algorithms, Theory.

Keywords: Spatio-temporal patterns, Moving objects, Geometric algorithms, Approximation algorithms.

1. INTRODUCTION

Moving point object data has become increasingly available since the development of GPS and radio transmitters. One of the objectives of spatio-temporal data mining [11, 16, 20] is to analyze such data sets for interesting patterns. For example, a group of caribou with radio collars gives rise to the positions of each caribou in a sequence of time steps. More examples are moose in Sweden (25 animals reported every 30 minutes), leopards in South Africa (32 animals reported daily), mountain goats in the USA (32 animals reported every 3 hours), and so on [23]. Analyzing this data gives insight into entity behavior, in particular, migration patterns [18].
The analysis of moving objects also has applications in socio-economic geography [4], transport analysis [19] and in defense and surveillance areas [17].

* NICTA is funded by the Australian Government's Backing Australia's Ability initiative, in part through the Australian Research Council.
† Supported by the Netherlands Organisation for Scientific Research (NWO) under FOCUS/BRICKS grant number 642.065.503 (GADGET).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM-GIS'06, November 10-11, 2006, Arlington, Virginia, USA. Copyright 2006 ACM 1-59593-529-0/06/0011 ...$5.00.

In general the input is a set P of n moving point objects p_1, ..., p_n in the plane whose locations are known at τ consecutive time steps t_1, ..., t_τ; that is, the trajectory of each object is a polygonal line that can self-intersect. For brevity, we will call moving point objects entities from now on. It is assumed that the velocity of an entity along a line segment of the trajectory is constant.

There are several slightly different definitions of flocks [2, 13, 14]. We will use the following definition, see Fig. 1 (a).

Definition 1. flock(m, k, r): Given a set of n trajectories of entities in the plane, where each trajectory consists of τ line segments, a flock in a time interval I, where the duration of I is at least k, consists of at least m entities such that for every point in time within I there is a disk of radius r that contains all the m entities (note that m ∈ N, k ∈ R and r > 0 are given constants).
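The condition in Definition 1 can be illustrated at a single instant with a small brute-force sketch. This is not one of the algorithms of the paper. It assumes, purely for illustration, that trajectory vertices are given at integer time steps 0, 1, 2, ..., and the names position, candidate_centers, max_in_disk and is_flock_at are hypothetical. The sketch relies on the standard observation that if some disk of radius r covers a set of points, then a disk of radius r that is either centered at one of the points or has two of the points on its boundary covers that set as well.

```python
import math
from itertools import combinations


def position(traj, t):
    """Location at time t of an entity with vertices traj[0], traj[1], ...

    Assumption of this sketch: vertex j is the location at time step j,
    and the entity moves at constant velocity along each segment.
    Requires len(traj) >= 2 and 0 <= t <= len(traj) - 1.
    """
    i = min(int(math.floor(t)), len(traj) - 2)
    s = t - i
    (x0, y0), (x1, y1) = traj[i], traj[i + 1]
    return (x0 + s * (x1 - x0), y0 + s * (y1 - y0))


def candidate_centers(points, r):
    """Yield the centers of the candidate disks of radius r."""
    for p in points:
        yield p
    for (ax, ay), (bx, by) in combinations(points, 2):
        dx, dy = bx - ax, by - ay
        d = math.hypot(dx, dy)
        if d == 0 or d > 2 * r:
            continue  # coincident points, or too far apart to share a disk
        h = math.sqrt(max(r * r - (d / 2) ** 2, 0.0))
        mx, my = (ax + bx) / 2, (ay + by) / 2
        ux, uy = -dy / d, dx / d  # unit normal to the segment ab
        yield (mx + h * ux, my + h * uy)
        yield (mx - h * ux, my - h * uy)


def max_in_disk(points, r, eps=1e-9):
    """Largest number of points covered by one disk of radius r (cubic time)."""
    best = 0
    for cx, cy in candidate_centers(points, r):
        count = sum(1 for x, y in points
                    if math.hypot(x - cx, y - cy) <= r + eps)
        best = max(best, count)
    return best


def is_flock_at(trajs, t, m, r):
    """True if at time t some disk of radius r contains at least m entities."""
    return max_in_disk([position(tr, t) for tr in trajs], r) >= m
```

The check covers one instant only. Definition 1 requires the condition to hold at every point in time within an interval of duration at least k, so evaluating is_flock_at on a sample of times can refute a flock but cannot certify one.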
This definition is almost identical to Definition 1 in [2], with the difference that k is not restricted to be an integer.

We consider two variants of flocks. Either the same m entities stay together during the entire interval (fixed-flock), or the entities in the flock change during the interval (varying-flock). If the entities may change, we require that the disk of radius r changes location in a continuous way. Note that both cases require at least m entities to be within the disk at every moment in I.

A meeting pattern is defined as follows, see Fig. 1 (b):

Definition 2. meet(m, k, r): Given a set of n trajectories of entities in the plane, where each trajectory consists of τ line segments, a meeting in a time interval I, where the duration of I is at least k, consists of at least m entities that stay within a stationary disk of radius r during I (note that m ∈ N, k ∈ R and r > 0 are given constants).

We also consider two variants of meetings: either the same m entities stay together during the entire interval (fixed-meet), or the entities in the meeting region change during the interval (varying-meet).

In this paper we consider optimization variants of the flock and meeting problems:

flock(m, max, r)/meet(m, max, r): compute the longest duration pattern.

We will show that varying-flock, fixed-meet, and varying-meet can be solved in time polynomial in n and τ, whereas fixed-flock is NP-hard. For all four problems, we also present
approximation algorithms by allowing a slightly larger radius, i.e., the radius of the reported group may not be bounded by r, but instead it is bounded by cr, where c > 1.

Figure 1: (a) A flock for p_1, p_2, p_3 marked in grey. (b) A meeting for p_1 and p_2 marked in grey.

Consider a longest duration flock F* for the flock(m, max, r)-problem and let I* denote the time interval during which F* is defined. A flock F during a time interval I is said to be a c-radius approximation of the flock(m, max, r)-problem if and only if F consists of at least m entities such that for every point in time within I there is a disk of radius cr that contains all the m entities, and I is at least as long as I*. This definition applies to the fixed and varying versions, and is analogous for the meeting patterns.

Approximating the radius of the region or subset size for spatio-temporal data mining was first considered by Gudmundsson et al. [6]. They proposed to approximate the patterns: "Any exact values of m and r hardly have a special significance: 20 caribou meeting in a circle with radius 50 meters form as interesting a pattern as 19 caribou meeting in a circle with radius 51 meters." In this paper we only consider approximations of the radius, not of the subset size.

Previous results. From a data mining and database perspective, research has mainly focused on modeling, querying and indexing spatio-temporal data, see for example [7, 9, 12, 21]. A common approach in database research is to take an existing spatial query type and then study its generalizations to spatio-temporal data, see for example [8, 15]. Recent advances in data mining have been accomplished by using association rule mining to distinguish regions with high activity [22], for example, sinks, sources and thoroughfares.
A different approach was suggested by Laube and Imfeld [13]: the REMO framework (RElative MOtion) considers similar behavior in groups of entities. They define several spatio-temporal patterns for trajectories, based on similar direction of motion or change of direction. Laube et al. [14] extended the framework by including not only direction of motion, but also location itself. They defined several spatio-temporal patterns, including flock, leadership, convergence, and encounter, and gave algorithms to compute them efficiently. Among other results, they developed an algorithm for finding the largest flock pattern (maximum number of entities) using the higher-order Voronoi diagram, with running time O(τ(nm^2 + n log n)); they also proved that the detection problem can be answered in O(τ(nm + n log n)) time. Applying the algorithm by Aronov and Har-Peled [1] to the problem gives a (1 + ε)-approximation with expected running time O(τ n log^2 n / ε^2), where the algorithm approximates the flock size. Gudmundsson et al. [6] showed that if the disk is (1 + ε)-approximated then the detection problem can be solved in O(τ(n log(1/ε)/ε^2 + n log n)) time.

However, the algorithms listed above use a different definition of flock than the one used in this paper. Their definition only considers the entities at one time step, i.e., a set of at least m entities within a circular region of radius r is a flock if they move in the same direction. Benkert et al. [2] argue that this is not sufficient for many applications. Flocks may need many time steps to be properly defined. For example, in some of the aforementioned applications the coordinates of the tracked animals are reported every 30 minutes and a group of animals must stay together for days to form a flock. Benkert et al. [2] recently used a different approach to compute "approximate" flocks.
They transform a subpath of length k (assumed to be an integer) of each trajectory into a point in 2k-dimensional space. Then the problem is reduced to finding 2k-dimensional balls that contain at least m points. They give a (1 + ε)-radius approximation to fixed-flock(m, k, r) with running time O((nτk^2 / (mε^(2k))) (log n + ε^(1−2k))). Note however that the optimization version of the flock problem is not considered in [2], and the running time of their algorithm is exponential in k.

This paper is organized as follows. In the next section we show that the fixed-flock(m, max, r) problem is as hard as MaxClique to compute and approximate. In Section 3, we study the flock problem further, give a polynomial time, exact algorithm for varying-flock(m, max, r), and approximation algorithms for both variants. In Section 4 we consider the meeting problem and give exact and approximation algorithms for both variants. We conclude with future research. Our algorithmic results are summarized in Table 1.

2. HARDNESS RESULTS

We start by proving two hardness results, which also motivate why we study c-radius approximation algorithms in the following sections. The first concerns maximizing the subset size of the flock, whereas the second concerns maximizing the duration. Both reductions use MaxClique, which is NP-hard [5] and, furthermore, cannot be approximated well.

Fact 1. (Håstad 1999 [10]) For any constant ε > 0, MaxClique cannot be approximated in polynomial time within a factor of n^(1/2−ε) unless P = NP, and not within a factor of n^(1−ε) unless NP = ZPP.

Theorem 1. The problem of computing a (2 − ε)-radius approximation of fixed-flock(max, k, r), for any 0 < ε ≤ 1, is NP-hard.

Proof. The reduction is from MaxClique. Let G = (V, E) be some graph with n vertices, and suppose we wish to determine whether a clique of size m exists. We construct an instance of the flock problem as follows. Every vertex is represented by an entity.
At time step zero, all entities are at the origin of the plane. Assume that r = 1; the instance can be scaled to realize any value of r. We define the locations of all entities at all time steps as follows. For 0 < i ≤ n and time steps 3i − 2 and 3i, we let all entities be at (5i, 0). At time step 3i − 1, the entity of vertex v_i is at (5i, 2), and all entities of vertices that are not connected to v_i in G are at (5i, −2), as shown in Fig. 2. We easily observe that at time 3i − 1, a circle of radius 1 can contain the i-th entity and all entities whose vertices are connected to v_i in G, or a circle of radius 1 contains all entities except
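The part of the construction described so far can be made concrete with a short sketch that builds the entity locations for a given graph. It assumes that vertices are numbered 1, ..., n and that entities of vertices adjacent to v_i are at (5i, 0) at time step 3i − 1; the text does not state the latter explicitly, although the observation about the circle of radius 1 suggests it. The function names build_instance and radius_needed are hypothetical.

```python
def build_instance(n, edges):
    """Locations of the n entities at time steps 0, 1, ..., 3n.

    Returns a dict mapping each vertex v to a list whose entry j is the
    location of the entity of v at time step j.
    """
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    traj = {v: [(0, 0)] for v in range(1, n + 1)}  # time step 0: origin
    for i in range(1, n + 1):
        for v in range(1, n + 1):
            if v == i:
                mid = (5 * i, 2)    # the entity of vertex v_i
            elif v in adj[i]:
                mid = (5 * i, 0)    # assumption: neighbours of v_i stay put
            else:
                mid = (5 * i, -2)   # vertices not connected to v_i
            # Locations at time steps 3i - 2, 3i - 1 and 3i.
            traj[v] += [(5 * i, 0), mid, (5 * i, 0)]
    return traj


def radius_needed(traj, subset, step):
    """Radius of the smallest disk covering `subset` at a time step.

    Valid here because at every time step all entities lie on a common
    vertical line, so half the spread in y is the required radius.
    """
    ys = [traj[v][step][1] for v in subset]
    return (max(ys) - min(ys)) / 2
```

For a triangle on vertices 1, 2, 3 plus an isolated vertex 4, radius_needed(build_instance(4, [(1, 2), (2, 3), (1, 3)]), {1, 2, 3}, 2) evaluates to 1.0, whereas the non-adjacent pair {1, 4} gives 2.0 at the same time step, which is why a radius below 2 cannot cover two entities whose vertices are not connected.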


