High dimensionality affects both the process and the results of multivariate time series data mining, and traditional principal component analysis (PCA) methods are limited in how well they represent the features of multivariate time series. To address this, a feature representation method for multivariate time series based on the correlation among variables is proposed. The covariance matrix of each multivariate time series is used to describe its distribution and the relationships among its variables, and PCA is then applied to an integrated covariance matrix to extract the principal components. In this way, the dimensionality of the multivariate time series is reduced and their features are represented. Experimental results show that the proposed method not only improves the quality of multivariate time series data mining but also mines multivariate time series of unequal length quickly and effectively.
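The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the per-series covariance matrices are integrated or what the final feature looks like, so the sketch assumes a simple average for the integrated matrix and takes each series' covariance matrix expressed in the shared principal subspace as its feature. The function name `covariance_features` and the parameter `k` are hypothetical.

```python
import numpy as np


def covariance_features(series_list, k):
    """Sketch of covariance-based feature representation.

    series_list: list of arrays, each of shape (length_i, n_vars);
                 the lengths may differ, n_vars must be the same.
    k:           number of principal components to keep (hypothetical name).
    """
    # One covariance matrix per series. Its shape is (n_vars, n_vars),
    # so it does not depend on the length of the series.
    covs = [np.cov(x, rowvar=False) for x in series_list]

    # ASSUMPTION: the integrated covariance matrix is the plain average.
    integrated = np.mean(covs, axis=0)

    # Eigen-decomposition of the symmetric integrated matrix;
    # eigh returns eigenvalues in ascending order, so re-sort descending.
    eigvals, eigvecs = np.linalg.eigh(integrated)
    order = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, order]  # shape (n_vars, k)

    # ASSUMPTION: the feature of each series is its covariance matrix
    # projected onto the shared principal subspace, shape (k, k).
    features = np.stack([components.T @ c @ components for c in covs])
    return features, components


# Three synthetic series of unequal length with 5 variables each.
rng = np.random.default_rng(0)
series_list = [rng.normal(size=(n, 5)) for n in (40, 65, 100)]
features, components = covariance_features(series_list, k=2)
```

Because every series is summarized by a matrix whose size depends only on the number of variables, series of unequal length end up with features of identical shape, which is the property the abstract highlights.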