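In analysis based on multidimensional data, analysts often face enormous data cubes. Although on-line analytical processing (OLAP) offers flexible presentation and analysis functions, it supports only hypothesis-driven exploration, in which important information is easily overlooked. Existing discovery-driven exploration, in turn, navigates by local exceptions and is therefore easily disturbed by noise in the data. To address these problems, a new navigation method is proposed: exceptional-distribution-driven navigation. It is an effective aid for exploring data cubes that guides the user step by step toward the most informative parts of the data. The method treats dimensions and dimension members as the skeleton along which a data cube is explored and, based on the characteristics of the data distribution, computes a surprise value for every dimension and every dimension member to serve as navigation markers for the user. Experimental results show that the navigation method is practical and effective.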
Although OLAP (on-line analytical processing) provides various exploratory and analytical functions, it supports only hypothesis-driven exploration, so analysts working in a large search space may overlook important information. Moreover, existing discovery-driven exploration is based on exceptional cells, which are easily affected by noise. To overcome these problems, a new cube navigation method is proposed that guides the analyst toward the most surprising parts of the cube. The method regards dimensions and dimension members as the skeleton of the data cube. By extracting the distribution features of the corresponding data set, it assigns each dimension and each of its members a surprise value that serves as a navigation beacon. Experimental results show that the method is practical and effective.