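High-dimensional OLAP data sets contain too much information of uneven quality, so users find it difficult to choose a suitable set of dimensions to query, which reduces the efficiency and accuracy of decision making. To address this, variable selection is applied to the process of OLAP query recommendation. The aim is to identify, in a simulated OLAP data set containing massive high-dimensional information, the noise attributes that are unrelated to the measure attributes and the dimension attributes that are correlated with one another, thereby narrowing the query scope while preserving the accuracy of the partitioning of the measure attribute space. Based on a nonparametric approach, FFTB, a variable selection algorithm supporting OLAP query recommendation, is designed, and a simulation model of OLAP query recommendation based on variable selection is built, in which a heuristic method discovers the dimensions closely related to the query objective. Detailed simulation experiments on the OLAP query data environment and the query recommendation process verify the usability and effectiveness of the method. The experiments show that variable selection can effectively reduce the OLAP query space while maintaining accuracy, which helps decision makers select key dimensions from large volumes of data, achieves the goal of OLAP query recommendation, and improves decision efficiency.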
Multi-dimensional OLAP data sets contain too much information, and the quality of that information is uneven. As a result, it is hard to choose the proper dimensions for OLAP queries, which reduces the efficiency and accuracy of decision making. To solve this problem, variable selection is introduced into OLAP query recommendation. The criterion for variable selection in OLAP query recommendation is to recognize the noise attributes that are uncorrelated with the measure attributes, as well as the dimension attributes that are correlated with one another, while preserving the accuracy of space partitioning. A nonparametric variable selection algorithm, FFTB, is proposed and used to build a simulation model for OLAP query recommendation, in which a heuristic method recognizes the dimensions closely related to the query objectives. To verify the usability of the simulation model, the data environment and the query recommendation procedure are simulated in the experiment. The experimental results reveal that the model drastically reduces the OLAP query space, which helps to recognize the key dimensions and thus improves decision efficiency.