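A similarity query over activity trajectories retrieves, from a set of trajectories annotated with keywords, the activity trajectories that are closest to a set of query points and satisfy the keyword requirements of those query points. Because GAT (Grid index for Activity Trajectories) cannot handle queries over massive activity trajectory data, this paper extends GAT into GATH (GAT on Hadoop), a similarity query technique suited to massive activity trajectories. Compared with GAT, GATH prunes with two new index structures: its grid index performs space-based pruning starting from the bottom-level cells, in keeping with the characteristics of massive data, and its inverted index performs keyword-based pruning. Experimental results confirm that, compared with GAT, GATH shortens index construction time and improves pruning efficiency.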
Given a sequence of query locations, each associated with a set of key activities, an activity trajectory similarity query returns the k trajectories that cover the query activities and yield the shortest minimum match distance. Because GAT (Grid index for Activity Trajectories) is not designed for big data, this paper introduces a new structure, GATH (GAT on Hadoop), to support similarity search over massive activity trajectories. GATH uses a grid index for spatial pruning and an inverted index for keyword pruning. Experimental results demonstrate that GATH is more efficient than GAT in both index building and data pruning.
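To make the two pruning steps concrete, the following is a minimal single-machine sketch, not the GATH implementation itself. It assumes trajectories are lists of `(x, y, activities)` points, and every name in it (`CELL`, `cell_of`, `build_indexes`, `keyword_prune`, `spatial_prune`, the `reach` parameter) is hypothetical. The spatial step is deliberately coarse: it keeps any trajectory with a point in a cell near some query location and computes no match distance, and the Hadoop distribution is omitted entirely.

```python
from collections import defaultdict

CELL = 1.0  # hypothetical side length of a bottom-level grid cell


def cell_of(x, y, size=CELL):
    """Map a coordinate to the (column, row) of its grid cell."""
    return (int(x // size), int(y // size))


def build_indexes(trajectories, size=CELL):
    """Build a grid index (cell -> ids) and an inverted index (keyword -> ids)."""
    grid = defaultdict(set)
    inverted = defaultdict(set)
    for tid, points in trajectories.items():
        for x, y, activities in points:
            grid[cell_of(x, y, size)].add(tid)
            for activity in activities:
                inverted[activity].add(tid)
    return grid, inverted


def keyword_prune(inverted, query):
    """Keep only trajectories that contain every activity named in the query.

    Returns an empty set when the query names no activities.
    """
    needed = set().union(*(acts for _, _, acts in query))
    candidates = None
    for activity in needed:
        ids = inverted.get(activity, set())
        candidates = set(ids) if candidates is None else candidates & ids
    return candidates or set()


def spatial_prune(grid, query, candidates, reach=1, size=CELL):
    """Keep candidates with a point within `reach` cells of any query location."""
    nearby = set()
    for x, y, _ in query:
        cx, cy = cell_of(x, y, size)
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                nearby |= grid.get((cx + dx, cy + dy), set())
    return candidates & nearby


# Made-up example data: four short activity trajectories.
trajectories = {
    "t1": [(0.5, 0.5, {"coffee"}), (1.5, 0.5, {"gym"})],
    "t2": [(0.2, 0.8, {"coffee"}), (9.5, 9.5, {"gym"})],
    "t3": [(0.4, 0.4, {"coffee"})],
    "t4": [(8.5, 8.5, {"coffee", "gym"})],
}
query = [(0.5, 0.5, {"coffee"}), (1.2, 0.6, {"gym"})]

grid, inverted = build_indexes(trajectories)
by_keyword = keyword_prune(inverted, query)           # drops t3 (no "gym")
survivors = spatial_prune(grid, query, by_keyword)    # drops t4 (far away)
```

In this toy data, `t2` survives both filters even though its "gym" point is far from the query, which shows why pruning only narrows the candidate set: the remaining trajectories would still need their actual match distance computed to rank the top k.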