To address real-time querying over trajectory data streams, we propose TSMF, a trajectory data stream mining framework oriented toward real-time query processing. The framework has two parts: online trajectory data stream mining and offline real-time query processing. In the online part, the incoming trajectory data are first grouped by density-based line segment stream clustering to obtain dense line segment clusters. The trajectory clusters and swarm patterns are then updated online from these line segment clusters, using two storage index structures: the trajectory cluster tree (TCT) and the swarm pattern hash table (SHT). In the offline part, three real-time query processing methods for moving objects are implemented to answer user requests: the current closed trajectory clusters query (CCTC), the current closed swarm query (CCSwarm), and the k-nearest neighboring trajectory query (k-NNT). When a user issues a query, the result is looked up quickly in the trajectory clusters and swarm patterns already mined in the online part. Comprehensive experiments on large-scale real and synthetic trajectory data demonstrate the mining effectiveness, efficiency, and scalability of TSMF, as well as its fast query processing.
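The split between online mining and query-time lookup can be illustrated with a small Python sketch. It is not the TSMF algorithm: a uniform grid over segment midpoints stands in for density-based line segment clustering, a plain dictionary stands in for the SHT, and closedness of swarms is not checked. The names `Segment`, `ToyTSMF`, `cell`, `min_objects`, and `min_times` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Segment:
    traj_id: int   # moving object that produced this line segment
    t: int         # timestamp (window index) of the segment
    x1: float
    y1: float
    x2: float
    y2: float


class ToyTSMF:
    """Illustrative stand-in only; grid cells replace density-based clusters."""

    def __init__(self, cell: float = 1.0, min_objects: int = 2, min_times: int = 2):
        self.cell = cell                # assumed grid resolution
        self.min_objects = min_objects  # objects needed for a cell to count as a cluster
        self.min_times = min_times      # timestamps needed to report a pattern
        # Hypothetical stand-in for the swarm pattern hash table:
        # object set -> timestamps at which that exact set was clustered together.
        self.swarm_table: Dict[FrozenSet[int], Set[int]] = defaultdict(set)

    def _cell_of(self, s: Segment) -> Tuple[int, int]:
        # Locate a segment by the grid cell containing its midpoint.
        mx, my = (s.x1 + s.x2) / 2.0, (s.y1 + s.y2) / 2.0
        return (int(mx // self.cell), int(my // self.cell))

    def update(self, segments: List[Segment]) -> None:
        """Online part: cluster a batch of segments, then update the index."""
        groups: Dict[Tuple[int, Tuple[int, int]], Set[int]] = defaultdict(set)
        for s in segments:
            groups[(s.t, self._cell_of(s))].add(s.traj_id)
        for (t, _cell), objs in groups.items():
            if len(objs) >= self.min_objects:
                self.swarm_table[frozenset(objs)].add(t)

    def current_swarms(self) -> List[Tuple[FrozenSet[int], List[int]]]:
        """Query part: read already-mined patterns; no clustering happens here."""
        return [
            (objs, sorted(ts))
            for objs, ts in self.swarm_table.items()
            if len(ts) >= self.min_times
        ]


framework = ToyTSMF(cell=10.0, min_objects=2, min_times=2)
framework.update([
    Segment(1, 0, 1.0, 1.0, 2.0, 2.0),
    Segment(2, 0, 1.5, 1.0, 2.5, 2.0),
    Segment(3, 0, 50.0, 50.0, 51.0, 51.0),
])
framework.update([
    Segment(1, 1, 3.0, 3.0, 4.0, 4.0),
    Segment(2, 1, 3.5, 3.0, 4.5, 4.0),
    Segment(3, 1, 80.0, 80.0, 81.0, 81.0),
])
result = framework.current_swarms()
```

In this sketch all clustering cost is paid in `update`, so `current_swarms` reduces to a scan of the table; this mirrors the design choice the abstract describes, where queries are answered from patterns that have already been mined.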