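Uncertain time series are long, and their value at each sampling point is itself uncertain, so similarity matching and cluster mining on them have high time complexity. To address this problem, this paper proposes a trend-based similarity measure and a trend-based clustering method for time series. The trend-based similarity measure maps each time series to a short sequence of trend symbols according to the overall trend of the series, and then uses the first-order connectivity index of each trend together with the Tanimoto coefficient to measure similarity. The trend-based clustering method defines a trend height, iteratively partitions the trend symbol sequence into intervals and judges the trend of each interval, builds a trend tree from the result, and finally groups into one class the sequences whose trend-tree root nodes carry the same trend symbol. Experimental results show that: a) the first-order connectivity indices of the five trend symbols uniquely represent a time series; b) the trend-based similarity measure completes similarity matching of time series effectively in polynomial time; c) the trend-based clustering method integrates similarity measurement and clustering into a single process and achieves notable clustering results.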
Uncertain time series are very long, and the value at each observation point is uncertain, which makes similarity measurement and clustering of such series costly. To solve this problem, this paper proposes new methods based on the trend of time series. The similarity measure maps each time series to a short trend symbol series, then uses the connectivity index and the Tanimoto coefficient to measure similarity between time series. The clustering method defines a height for trend symbols and iteratively judges the trend of the trend symbol series until it is reduced to a single trend symbol, then groups time series with an identical trend symbol into one class. Experiments show that the proposed methods perform similarity measurement and clustering effectively at low time cost, and that the connectivity indices of the five trend symbols can uniquely represent a time series.
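To make the idea of a trend-based similarity measure concrete, the following minimal Python sketch maps a series to a short symbol sequence and compares two series with the Tanimoto coefficient. It is an illustration under stated assumptions, not the paper's method: the five symbol names, the fixed window, the slope thresholds `small` and `large`, and the use of symbol counts as the feature vector are all hypothetical, and the connectivity-index step is omitted because the abstract does not specify it. Only the Tanimoto formula for real vectors, T(a, b) = a·b / (|a|² + |b|² − a·b), is standard.

```python
from typing import List, Sequence

# Hypothetical five-symbol alphabet; the abstract does not name the symbols.
SYMBOLS = ("sharp_down", "down", "flat", "up", "sharp_up")


def trend_symbols(series: Sequence[float], window: int = 4,
                  small: float = 0.1, large: float = 1.0) -> List[str]:
    """Emit one trend symbol per non-overlapping window of the series.

    The symbol is chosen from the average slope per step inside the window.
    Window size and thresholds are illustrative assumptions.
    """
    symbols = []
    for start in range(0, len(series) - window + 1, window):
        segment = series[start:start + window]
        slope = (segment[-1] - segment[0]) / (window - 1)
        if slope <= -large:
            symbols.append("sharp_down")
        elif slope < -small:
            symbols.append("down")
        elif slope <= small:
            symbols.append("flat")
        elif slope < large:
            symbols.append("up")
        else:
            symbols.append("sharp_up")
    return symbols


def symbol_counts(symbols: Sequence[str]) -> List[float]:
    """Count how often each of the five symbols occurs (assumed feature vector)."""
    return [float(list(symbols).count(s)) for s in SYMBOLS]


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Tanimoto coefficient of two real vectors: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # Two all-zero vectors are treated as identical.
    return 1.0 if denom == 0 else dot / denom


def trend_similarity(s1: Sequence[float], s2: Sequence[float]) -> float:
    """Similarity of two series via their trend-symbol count vectors."""
    return tanimoto(symbol_counts(trend_symbols(s1)),
                    symbol_counts(trend_symbols(s2)))
```

Calling `trend_similarity(x, y)` on two numeric lists returns a value that equals 1.0 when both series yield the same symbol counts. Because each window collapses to a single symbol, the comparison works on a sequence much shorter than the raw series, which is the source of the cost reduction the abstract describes.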