In multi-dimensional data analysis and processing, some entries are often missing or unknown. How to exploit the latent structure of the known data to fill in these missing entries is a pressing problem. Existing work on missing-data filling mostly targets low-dimensional data in vector or matrix form, and little research addresses high-dimensional data with three or more dimensions. To address this gap, this paper proposes a multi-dimensional data filling algorithm based on tensor decomposition, which exploits the structural properties of the CP (CANDECOMP/PARAFAC) decomposition model and the uniqueness of the decomposition to fill in missing entries in multi-dimensional data effectively. Experiments on restoring images with missing data stored in three-dimensional form, with a comparison against the CP-WOPT algorithm, show that the proposed algorithm achieves relatively high accuracy and a relatively fast running time.
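To make the idea of CP-based filling concrete, the following is a minimal sketch under stated assumptions. The abstract does not give the update rules of the proposed algorithm, so this is not the paper's method and not CP-WOPT. It is a generic alternating least squares (ALS) fit of a rank-R CP model to a three-way tensor, in which missing entries are re-imputed from the current low-rank estimate after every sweep. The function names (`khatri_rao`, `unfold`, `cp_reconstruct`, `cp_complete`) and the parameters `rank`, `n_iter`, and `seed` are hypothetical and chosen only for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    r = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, r)


def unfold(x, mode):
    """Mode-n unfolding; the remaining modes are flattened in C order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def cp_reconstruct(factors):
    """Rebuild a 3-way tensor from its CP factor matrices A, B, C."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def cp_complete(x, mask, rank, n_iter=200, seed=0):
    """Fill missing entries of a 3-way tensor with a rank-`rank` CP model.

    x    : 3-way array; values where mask is False are ignored (may be NaN)
    mask : boolean array, True where x is observed
    Assumption: plain ALS with re-imputation, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in x.shape]
    # Start by filling the gaps with the mean of the observed entries.
    filled = np.where(mask, x, x[mask].mean())
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            # Gram matrix of the Khatri-Rao product via Hadamard product.
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # Least-squares update of the factor matrix for this mode.
            factors[mode] = unfold(filled, mode) @ kr @ np.linalg.pinv(gram)
        # Keep observed values; overwrite gaps with the model estimate.
        filled = np.where(mask, x, cp_reconstruct(factors))
    return filled, factors
```

For an image stored as a height × width × channel array, one would pass the array as `x` together with a boolean `mask` marking the known pixels. The choice of `rank` is left to the user here, and a rank that is too large may overfit the observed entries.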