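Using the raw time series of meteorological parameters such as wind speed, temperature, humidity, and pressure directly for data mining tasks such as short-term prediction, similarity matching, classification, and clustering is not only inefficient but also impairs the accuracy and reliability of time series data mining. A simple and fast selection algorithm based on feature points is proposed for the piecewise linear representation of time series. Experiments are conducted on time series of meteorological parameters, and the method is compared with another sequence segmentation algorithm in terms of computational performance and fitting error. The results show that the method effectively extracts the main shape of a series while reducing the dependence on the threshold. It has a low computational cost, is fast and convenient, and is broadly applicable, and it shows good prospects for meteorological data compression.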
Applying short-term prediction, similarity search, classification, and clustering directly to the raw time series of meteorological parameters, such as the temperature refractive index structure parameter, wind speed, and temperature, is not only inefficient but also degrades the accuracy and reliability of time series data mining. This article proposes a simple and fast method that selects extreme points and trend turning points to build a piecewise linear representation of a time series. The method extracts the main pattern of a series effectively and reduces the dependence on a threshold. It has a low computational cost, is fast and convenient, and is broadly applicable. Experiments are carried out on the temperature refractive index structure parameter and other meteorological parameters, and the method is compared with another sequence segmentation algorithm. The results show that the proposed method reflects the pattern of a time series effectively and accurately.
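The idea of a piecewise linear representation built from feature points can be illustrated with a minimal sketch. It is not the algorithm of this article: it keeps only the endpoints and the strict local extrema, and it omits trend turning points and any threshold-based screening, because the abstract does not define them. The function names `extrema_indices`, `reconstruct`, and `fitting_error` are hypothetical and chosen for illustration, and the fitting error is assumed here to be the Euclidean distance between the series and its reconstruction.

```python
from typing import List, Sequence


def extrema_indices(series: Sequence[float]) -> List[int]:
    """Return the indices of both endpoints and of every strict local extremum."""
    n = len(series)
    if n <= 2:
        return list(range(n))
    keep = [0]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        # A sign change of the slope marks a local maximum or minimum.
        # Flat runs (a zero slope) are ignored in this simplified sketch.
        if left * right < 0:
            keep.append(i)
    keep.append(n - 1)
    return keep


def reconstruct(series: Sequence[float], keep: Sequence[int]) -> List[float]:
    """Rebuild the series by linear interpolation between the kept points."""
    if len(keep) < 2:
        return [float(x) for x in series]
    approx = [0.0] * len(series)
    for a, b in zip(keep, keep[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            approx[i] = series[a] + t * (series[b] - series[a])
    return approx


def fitting_error(series: Sequence[float], approx: Sequence[float]) -> float:
    """Euclidean distance between the series and its reconstruction (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(series, approx)) ** 0.5


# Illustrative values only, not measured meteorological data.
zigzag = [1, 2, 3, 2, 1, 2, 3]
zigzag_keep = extrema_indices(zigzag)
zigzag_error = fitting_error(zigzag, reconstruct(zigzag, zigzag_keep))

curve = [0, 1, 4, 9]
curve_keep = extrema_indices(curve)
curve_error = fitting_error(curve, reconstruct(curve, curve_keep))
```

In this sketch a zigzag series is represented exactly by its extrema, while a monotonic curve collapses to its two endpoints and leaves a nonzero fitting error. This is one reason why a method of this kind would need additional feature points, such as trend turning points, beyond the extrema.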