Singular value decomposition (SVD) does not always analyze preference features accurately, and its results are sometimes uninterpretable. To address these problems, a CUR matrix decomposition method, based on the joint selection of rows and columns, is proposed to obtain a low-rank approximation of the original matrix M (user preferences for products) and to extract the latent preferences of users and products. First, the statistical leverage scores of the rows and columns of M are calculated, and several columns and several rows with high scores are extracted to form the low-dimensional matrices C and R. The matrix U is then constructed approximately from M, C, and R. In this way, the problem of extracting preference features in a high-dimensional space is transformed into a matrix analysis problem in a low-dimensional space, which gives the method good interpretability and accuracy. Finally, theoretical analysis and experiments indicate that, compared with traditional decomposition methods, the CUR matrix decomposition method achieves higher accuracy, better interpretability, and a higher compression ratio in preference feature extraction.
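The procedure summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions, because the abstract does not specify them: the leverage scores are computed from the top-k singular vectors of M, the highest-scoring columns and rows are picked deterministically (many CUR variants instead sample them at random with probability proportional to the scores), and U is taken as C⁺ M R⁺, one common choice. The function names `leverage_scores` and `cur_decompose` and the parameters `k`, `c`, and `r` are hypothetical.

```python
import numpy as np


def leverage_scores(M, k, axis):
    """Normalized leverage scores of M's columns (axis=1) or rows (axis=0).

    Assumption: scores are the squared norms of the top-k singular vector
    entries, divided by k so that they sum to 1.
    """
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    if axis == 1:
        # One score per column, from the top-k right singular vectors.
        scores = np.sum(Vt[:k, :] ** 2, axis=0)
    else:
        # One score per row, from the top-k left singular vectors.
        scores = np.sum(U[:, :k] ** 2, axis=1)
    return scores / k


def cur_decompose(M, k, c, r):
    """Return C, U, R and the selected indices so that C @ U @ R approximates M."""
    col_scores = leverage_scores(M, k, axis=1)
    row_scores = leverage_scores(M, k, axis=0)

    # Assumption: keep the c columns and r rows with the highest scores.
    cols = np.sort(np.argsort(col_scores)[-c:])
    rows = np.sort(np.argsort(row_scores)[-r:])

    C = M[:, cols]  # actual columns of M (products)
    R = M[rows, :]  # actual rows of M (users)

    # Assumption: U = pinv(C) @ M @ pinv(R), built from M, C, and R.
    U = np.linalg.pinv(C) @ M @ np.linalg.pinv(R)
    return C, U, R, cols, rows


if __name__ == "__main__":
    # Toy preference matrix: 6 users, 5 products, exact rank 2.
    rng = np.random.default_rng(0)
    M = rng.random((6, 2)) @ rng.random((2, 5))
    C, U, R, cols, rows = cur_decompose(M, k=2, c=2, r=2)
    error = np.linalg.norm(M - C @ U @ R) / np.linalg.norm(M)
    print("selected columns:", cols, "selected rows:", rows)
    print("relative reconstruction error:", error)
```

Because C and R consist of actual columns and rows of M, each factor refers to concrete products and users. This is the source of the interpretability advantage over SVD, whose singular vectors are linear combinations of all users or all products.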