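To address the poor parallelism and large computational latency of FPGA-based solvers for the linear least squares problem, this paper proposes an FPGA computation method that solves linear least squares problems using a modified Cholesky factorization. The method splits the least squares problem into two parts, matrix factorization and triangular system solving, and improves parallelism in each part by maximizing the number of processing elements (PEs). In the matrix factorization part, the modified Cholesky factorization avoids square-root operations and converts division operations into multiplications, which reduces computational latency. In the triangular solving part, the upper and lower triangular linear systems are solved by reusing the same computational structure, which improves resource utilization. Experimental results on a Xilinx Virtex XC5VFX130T platform show that, in single precision, the method is more than 8 times as efficient as a PC platform.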
Large computational latency and poor parallelism greatly limit the efficiency of FPGA-based solvers for the least squares problem. We propose a novel approach based on a modified Cholesky factorization to address this. With this approach, the least squares problem is divided into a matrix factorization part and a triangular system solving part. Parallelism is improved by maximizing the number of processing elements (PEs) in each part. Computational latency is reduced because the modified Cholesky factorization avoids square-root operations and replaces division operations with multiplications. In the triangular solving part, the same PEs are used to solve both the upper and the lower triangular systems, which saves FPGA resources. Experiments on a Virtex XC5VFX130T FPGA with a 100 MHz clock show a speedup of 8× over a dual-core CPU implementation in single precision.
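As a software-only illustration of the data flow described above, the following minimal sketch solves a small least squares problem with a square-root-free factorization. It rests on several assumptions that the abstract does not confirm: that the problem is reduced to the normal equations AᵀAx = Aᵀb, that the "modified Cholesky" step is the LDLᵀ variant, and that divisions are replaced by computing one reciprocal per pivot and multiplying by it. The function names (`ldlt_factor`, `unit_triangular_solve`, `least_squares`) are hypothetical, and the code says nothing about the PE architecture or timing of the actual design.

```python
def ldlt_factor(m):
    """Factor a symmetric positive definite matrix as M = L * D * L^T.

    Returns (L, d_inv): L is unit lower triangular and d_inv[j] = 1 / D[j].
    No square root is taken (assumed form of the modified Cholesky step).
    """
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    d_inv = [0.0] * n
    for j in range(n):
        L[j][j] = 1.0
        d[j] = m[j][j] - sum(L[j][k] * L[j][k] * d[k] for k in range(j))
        # One reciprocal per column; every later "division" is a multiply.
        d_inv[j] = 1.0 / d[j]
        for i in range(j + 1, n):
            s = m[i][j] - sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = s * d_inv[j]
    return L, d_inv


def unit_triangular_solve(L, rhs, transposed):
    """Solve L y = rhs (transposed=False) or L^T x = rhs (transposed=True).

    One routine serves both substitutions, loosely mirroring the reuse of a
    single structure for the lower and upper triangular systems.
    """
    n = len(rhs)
    x = list(rhs)
    order = range(n - 1, -1, -1) if transposed else range(n)
    for i in order:
        ks = range(i + 1, n) if transposed else range(i)
        for k in ks:
            coeff = L[k][i] if transposed else L[i][k]
            x[i] -= coeff * x[k]
    return x


def least_squares(a, b):
    """Minimize ||A x - b||_2 via the normal equations (an assumption here)."""
    rows, cols = len(a), len(a[0])
    m = [[sum(a[r][i] * a[r][j] for r in range(rows)) for j in range(cols)]
         for i in range(cols)]
    g = [sum(a[r][i] * b[r] for r in range(rows)) for i in range(cols)]
    L, d_inv = ldlt_factor(m)
    z = unit_triangular_solve(L, g, transposed=False)   # forward substitution
    y = [zi * di for zi, di in zip(z, d_inv)]           # scale by D^-1, multiply only
    return unit_triangular_solve(L, y, transposed=True)  # back substitution
```

For example, fitting a line to the points (0, 1), (1, 3), (2, 5), (3, 7) with `least_squares([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]], [1.0, 3.0, 5.0, 7.0])` should return coefficients close to an intercept of 1 and a slope of 2. Because L has a unit diagonal, neither substitution needs a division, so the only reciprocals in the whole solve are the per-pivot ones in `ldlt_factor`.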