Segmenting high-dimensional time series makes it possible to create high-level symbolic representations. This paper proposes an unsupervised segmentation algorithm for high-dimensional time series that addresses the preprocessing problem of symbolizing high-dimensional data. The algorithm clusters the high-dimensional data and then applies a maximum entropy voting model to segment the series. Experimental results show that its average recall and precision are 0.86 and 0.88, respectively, and that its overall performance is better than that of the Principal Component Analysis (PCA) and Probabilistic Principal Component Analysis (PPCA) algorithms.
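
To make the reported recall and precision concrete, the following is a minimal sketch of how segmentation boundaries could be scored against a ground truth. The abstract does not describe the evaluation protocol, so the function name `boundary_precision_recall`, the `tolerance` parameter, and the greedy one-to-one matching rule are all assumptions made for illustration, and the boundary lists are hypothetical rather than data from the paper.

```python
def boundary_precision_recall(predicted, truth, tolerance=0):
    """Score predicted segment boundaries against ground-truth boundaries.

    Assumed rule (not taken from the paper): a predicted boundary is a hit
    if a still-unmatched true boundary lies within `tolerance` time steps.
    Each true boundary can be matched at most once.
    """
    unmatched = sorted(truth)
    hits = 0
    for p in sorted(predicted):
        # Closest true boundary that has not been matched yet.
        best = min(unmatched, key=lambda t: abs(t - p), default=None)
        if best is not None and abs(best - p) <= tolerance:
            hits += 1
            unmatched.remove(best)
    # Precision: fraction of predicted boundaries that are correct.
    precision = hits / len(predicted) if predicted else 0.0
    # Recall: fraction of true boundaries that were found.
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical boundaries, for illustration only.
    predicted = [10, 24, 41, 60]
    truth = [10, 25, 40, 55, 70]
    p, r = boundary_precision_recall(predicted, truth, tolerance=1)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Under this assumed rule, a larger `tolerance` forgives small boundary offsets and raises both scores, so any comparison between algorithms would need to use the same tolerance.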