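Dimensionality reduction of hyperspectral remote sensing images with fast independent component analysis (FastICA) involves large-scale matrix computation and a large number of iterations. Through hotspot analysis, we design parallel schemes for the hotspots (covariance matrix calculation, whitening, and ICA iteration) targeting the many integrated core (MIC) architecture, propose and implement a parallel dimensionality reduction algorithm, M-FastICA, and build a performance model for it. We study parallel program optimization strategies on MIC and propose a series of optimizations for each hotspot scheme, in particular a novel load-balancing method for lower triangular matrices, and we quantify their effects. Experimental results show that M-FastICA achieves a maximum speed-up of 42× and runs 2.2× faster than the multithreaded parallel version on a 24-core CPU. We also examine the relationship between the number of bands and parallel performance. The experimental data confirm the accuracy of the performance model.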
Fast independent component analysis (FastICA) for hyperspectral image dimensionality reduction involves massive matrix computations and many iterations. By analyzing the hotspots of the FastICA algorithm, we design parallel schemes for covariance matrix calculation, whitening, and ICA iteration on the many integrated core (MIC) architecture, and we implement and validate an M-FastICA algorithm. We further present a performance model for M-FastICA. We propose a series of optimization methods for the parallel schemes of the different hotspots: reforming arithmetic operations, interchanging and unrolling loops, transposing matrices, using intrinsics, and so on. In particular, we propose a novel method to balance the load when processing the lower triangular matrix. We then measure the performance effect of each optimization. Our experiments show that M-FastICA reaches a maximum speed-up of 42× in our tests and runs 2.2× faster than the parallel CPU version on 24 cores. We also investigate how the speed-up changes with the number of bands. The experimental results validate our performance model with acceptable accuracy, so the model can serve as a roofline for our optimization effort.
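To illustrate why the lower triangular matrix needs load balancing: a covariance matrix is symmetric, so only its lower triangle has to be computed, and row i of that triangle holds i + 1 entries. Splitting the rows into contiguous blocks therefore gives later threads far more work than earlier ones. The abstract does not describe the authors' balancing method, so the sketch below is only an assumed, commonly used alternative: pair each short row i with the long row n - 1 - i and deal the pairs out round-robin. All function names (`row_cost`, `block_partition`, `paired_partition`, `loads`) are hypothetical and chosen for illustration.

```python
def row_cost(i):
    """Entries in row i of a lower-triangular matrix, diagonal included."""
    return i + 1


def block_partition(n, workers):
    """Naive scheme: give each worker one contiguous block of rows."""
    base, extra = divmod(n, workers)
    parts, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        parts.append(list(range(start, start + size)))
        start += size
    return parts


def paired_partition(n, workers):
    """Assumed balancing scheme: pair row k with row n-1-k, deal pairs round-robin.

    Each full pair costs (k + 1) + (n - k) = n + 1 entries, so every pair
    carries the same amount of work regardless of k.
    """
    parts = [[] for _ in range(workers)]
    for k in range((n + 1) // 2):
        lo, hi = k, n - 1 - k
        rows = [lo] if lo == hi else [lo, hi]  # middle row is unpaired when n is odd
        parts[k % workers].extend(rows)
    return parts


def loads(parts):
    """Total number of matrix entries assigned to each worker."""
    return [sum(row_cost(i) for i in rows) for rows in parts]


if __name__ == "__main__":
    n, workers = 8, 4
    print("block :", loads(block_partition(n, workers)))
    print("paired:", loads(paired_partition(n, workers)))
```

With 8 rows and 4 workers, the contiguous split assigns 3, 7, 11, and 15 entries, while the paired split assigns 9 entries to every worker. The balance is exact only when the number of row pairs divides evenly among the workers; otherwise the loads differ by at most one pair.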