Correlation analysis of multidimensional data streams has received little study, and the existing work concentrates mainly on analysis over a single sliding window. This paper proposes an online canonical correlation analysis algorithm based on base windows (Base_win_CCA). The algorithm dynamically maintains the statistics of the base windows and uses them for multidimensional correlation analysis, which greatly reduces time and space complexity. It can also return correlations over multiple window ranges in response to concurrent requests from multiple users, so it is flexible, and its results are exact. Theoretical analysis and experimental results show that the algorithm's performance advantage is most pronounced with larger base windows, larger correlation query windows, more data streams, and more querying users.
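The abstract does not give the data structures used by Base_win_CCA, so the following is only a minimal sketch of the general idea it describes: keep additive sufficient statistics per base window, then merge the statistics of the most recent base windows to answer a canonical correlation query for any window that spans a whole number of base windows. The class name `BaseWindowCCA`, its parameters (`p`, `q`, `base_size`, `max_windows`), and the Cholesky-plus-SVD solution are assumptions made for illustration, not the authors' implementation.

```python
from collections import deque

import numpy as np


class BaseWindowCCA:
    """Hypothetical sketch: per-base-window sufficient statistics for CCA."""

    def __init__(self, p, q, base_size, max_windows):
        self.p, self.q, self.base_size = p, q, base_size
        # One (count, sum vector, sum of outer products) entry per full base window;
        # the deque silently drops the oldest entry once max_windows is reached.
        self.windows = deque(maxlen=max_windows)
        self._reset_current()

    def _reset_current(self):
        d = self.p + self.q
        self.n = 0
        self.s = np.zeros(d)
        self.ss = np.zeros((d, d))

    def update(self, x, y):
        """Consume one time step: x has p values, y has q values."""
        z = np.concatenate([x, y])
        self.n += 1
        self.s += z
        self.ss += np.outer(z, z)
        if self.n == self.base_size:
            self.windows.append((self.n, self.s, self.ss))
            self._reset_current()

    def query(self, k):
        """Canonical correlations over the k most recent complete base windows."""
        if k < 1 or k > len(self.windows):
            raise ValueError("k must be between 1 and the number of stored base windows")
        recent = list(self.windows)[-k:]
        # The statistics are additive, so merging windows is a plain sum.
        n = sum(w[0] for w in recent)
        s = sum(w[1] for w in recent)
        ss = sum(w[2] for w in recent)
        mean = s / n
        cov = (ss - n * np.outer(mean, mean)) / (n - 1)
        p = self.p
        sxx, syy, sxy = cov[:p, :p], cov[p:, p:], cov[:p, p:]
        # Whiten both blocks with Cholesky factors; the singular values of
        # Lx^{-1} Sxy Ly^{-T} are the canonical correlations.
        lx = np.linalg.cholesky(sxx)
        ly = np.linalg.cholesky(syy)
        m = np.linalg.solve(lx, sxy)
        m = np.linalg.solve(ly, m.T).T
        return np.clip(np.linalg.svd(m, compute_uv=False), 0.0, 1.0)
```

In this sketch a query costs time proportional to the number of base windows merged plus one small decomposition, independent of the number of raw points in the window, and several users can call `query` with different values of `k` against the same stored statistics. The covariance blocks must be positive definite for the Cholesky step to succeed, which is an assumption of the sketch.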