This paper introduces parallel computing into time series processing and proposes a time series similarity search algorithm based on Map/Reduce. By exploiting the capacity of cloud computing for large-scale computation and data processing, the algorithm reduces the computational cost of time series similarity search and simplifies the computation. The algorithm was applied to an electrocardiogram (ECG) dataset, performing similarity search with piecewise aggregate approximation (PAA) lower-bound filtering and dynamic time warping (DTW) distance computation, and the running time and parallel speedup were measured as the number of nodes varied. Compared with conventional single-machine computation, the proposed algorithm effectively improves the efficiency of time series mining.
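The filter-then-refine idea summarized above can be illustrated with a small single-process sketch. It rests on several assumptions that the abstract does not confirm: the PAA lower bound is taken to be a Keogh-style envelope bound (often called LB_PAA) under a Sakoe-Chiba warping window `r`, the search is a range query with threshold `epsilon`, all series have equal length divisible by `segments`, and the Map and Reduce phases are imitated with Python's built-in `map` and `functools.reduce` rather than a real cluster framework. All function and parameter names are hypothetical.

```python
import math
from functools import reduce


def paa(series, segments):
    """Piecewise aggregate approximation: mean of each equal-width segment."""
    width = len(series) // segments  # assumes len(series) % segments == 0
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def envelope(query, r):
    """Upper and lower envelopes of the query within a window of half-width r."""
    n = len(query)
    upper = [max(query[max(0, i - r):i + r + 1]) for i in range(n)]
    lower = [min(query[max(0, i - r):i + r + 1]) for i in range(n)]
    return upper, lower


def lb_paa(query, candidate, segments, r):
    """Assumed Keogh-style PAA lower bound on the band-constrained DTW distance."""
    width = len(query) // segments
    upper, lower = envelope(query, r)
    total = 0.0
    for i, c_bar in enumerate(paa(candidate, segments)):
        u_hat = max(upper[i * width:(i + 1) * width])
        l_hat = min(lower[i * width:(i + 1) * width])
        if c_bar > u_hat:
            total += width * (c_bar - u_hat) ** 2
        elif c_bar < l_hat:
            total += width * (c_bar - l_hat) ** 2
    return math.sqrt(total)


def dtw(a, b, r):
    """DTW distance for equal-length series with a Sakoe-Chiba band of half-width r."""
    n = len(a)
    inf = float("inf")
    prev = [inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [inf] * (n + 1)
        for j in range(max(1, i - r), min(n, i + r) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return math.sqrt(prev[n])


def map_candidate(record, query, epsilon, segments, r):
    """Map step: prune with the cheap bound, run DTW only on survivors."""
    key, series = record
    if lb_paa(query, series, segments, r) > epsilon:
        return []  # the bound already exceeds epsilon, so DTW must too
    dist = dtw(query, series, r)
    return [(key, dist)] if dist <= epsilon else []


def reduce_matches(acc, emitted):
    """Reduce step: merge emitted matches, ordered by DTW distance."""
    return sorted(acc + emitted, key=lambda kv: kv[1])


def similarity_search(records, query, epsilon, segments=4, r=2):
    mapped = map(lambda rec: map_candidate(rec, query, epsilon, segments, r),
                 records)
    return reduce(reduce_matches, mapped, [])
```

In a distributed setting each node would run the map step on its own partition of the dataset, so the per-record work (bound, then DTW only if needed) is what gets parallelized; the reduce step only merges short lists of matches.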