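By changing the order of computation in the CR algorithm, an improved conjugate residual (ICR) algorithm is proposed. Compared with the CR algorithm, the ICR algorithm has the same numerical stability and adds almost no computation. It is, however, designed with parallel performance on MIMD machines in mind: its synchronization overhead is half that of the CR algorithm, and all inner products and the matrix-vector product are independent, with no data dependencies, so computation can be overlapped with communication. The performance of the ICR algorithm is examined both theoretically and experimentally; when the number of processors is large, the ICR algorithm is faster than the CR algorithm. Numerical experiments on a 64-processor cluster show that the parallel ICR algorithm is about 30% faster than the CR algorithm.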
The conjugate residual (CR) algorithm is a Krylov subspace method for the fast solution of symmetric linear systems with very large, very sparse coefficient matrices. By changing the order of computation in the CR algorithm, this paper proposes an improved conjugate residual (ICR) algorithm. The numerical stability of the ICR algorithm is the same as that of the CR algorithm, but the synchronization overhead, which is the bottleneck for parallel performance, is reduced by a factor of two. In addition, all inner products within a single iteration step are independent, so the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Theoretical and experimental analysis shows that the ICR algorithm becomes faster than the CR algorithm as the number of processors increases. Experiments performed on a 64-processor cluster indicate that ICR is approximately 30% faster than CR.
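To show where the synchronization overhead comes from, the following is a minimal serial sketch of the textbook CR iteration. It is not the ICR algorithm, whose reordering is not given in this abstract. The helper names `matvec`, `dot`, and `conjugate_residual`, the dense list-of-lists matrix, and the zero initial guess are assumptions made for illustration only. In a distributed implementation each inner product would require a global reduction, and the comments mark the two points per iteration where such reductions depend on one another.

```python
def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows (illustrative only)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Inner product; in a parallel code this would be a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_residual(A, b, tol=1e-10, max_iter=1000):
    """Textbook conjugate residual iteration for a symmetric matrix A."""
    n = len(b)
    x = [0.0] * n
    r = list(b)              # r0 = b - A*x0 with x0 = 0
    p = list(r)
    Ar = matvec(A, r)
    Ap = list(Ar)
    rAr = dot(r, Ar)
    for _ in range(max_iter):
        # Convergence check on the residual norm (could be fused with
        # another reduction in a parallel code).
        if dot(r, r) ** 0.5 < tol:
            break
        # Synchronization point 1: (Ap, Ap) is needed before alpha.
        alpha = rAr / dot(Ap, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Ar = matvec(A, r)
        # Synchronization point 2: (r, Ar) depends on the updated r,
        # so it cannot be combined with the reduction above.
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]
        rAr = rAr_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_residual(A, b))
```

Because the second inner product depends on the vector update that follows the first, the standard ordering forces two separate global reductions per iteration. This is the dependency that a reordered variant such as ICR aims to remove.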