The materialized view selection problem is one of the most important problems in data warehouse design. To solve it efficiently, this paper proposes an enhanced genetic algorithm that selects a set of views to materialize so as to achieve good query performance and low view maintenance cost under a storage space constraint. The algorithm works in two stages. First, a preprocessing algorithm based on the maximum benefit per unit space generates the initial solutions. These initial solutions are then improved by a genetic algorithm that combines several optimization strategies: a selection operator that combines elitism with a modified k-tournament method, a half-uniform crossover operator based on Hamming distance, and a self-adaptive mutation operator. In addition, infeasible solutions generated during the evolution process are repaired using a loss function. Experimental results show that the proposed algorithm finds better solutions than both a heuristic algorithm and the canonical genetic algorithm.
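The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and it rests on several assumptions. Each candidate view is given a fixed `benefit` and `size`, whereas in the real problem the benefit of a view depends on which other views are materialized. A solution is encoded as a 0/1 list. The repair step simply drops the views with the lowest benefit per unit space; this stands in for the loss-function repair, whose exact form the abstract does not give. The function names `greedy_seed`, `hux_crossover`, and `repair` are hypothetical.

```python
import random


def greedy_seed(benefit, size, capacity):
    """Build an initial solution by taking views in descending order of
    benefit per unit space while they still fit within the capacity."""
    order = sorted(range(len(size)),
                   key=lambda i: benefit[i] / size[i], reverse=True)
    chosen, used = [0] * len(size), 0
    for i in order:
        if used + size[i] <= capacity:
            chosen[i] = 1
            used += size[i]
    return chosen


def hux_crossover(a, b, rng):
    """Half-uniform crossover: swap half of the bits in which the parents
    differ, so each child lies halfway (in Hamming distance) between them."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    child_a, child_b = a[:], b[:]
    for i in rng.sample(diff, len(diff) // 2):
        child_a[i], child_b[i] = b[i], a[i]
    return child_a, child_b


def repair(solution, benefit, size, capacity):
    """Assumed repair rule (not the paper's loss function): remove selected
    views with the lowest benefit per unit space until the solution fits."""
    fixed = solution[:]
    used = sum(s for s, bit in zip(size, fixed) if bit)
    selected = sorted((i for i, bit in enumerate(fixed) if bit),
                      key=lambda i: benefit[i] / size[i])
    for i in selected:
        if used <= capacity:
            break
        fixed[i] = 0
        used -= size[i]
    return fixed


if __name__ == "__main__":
    benefit, size, capacity = [10, 6, 4], [5, 2, 4], 7  # made-up toy data
    print(greedy_seed(benefit, size, capacity))
    print(repair([1, 1, 1], benefit, size, capacity))
    print(hux_crossover([1, 1, 0, 0], [0, 0, 1, 1], random.Random(0)))
```

A full genetic algorithm would wrap these pieces in a loop that applies elitist and tournament selection and an adaptive mutation rate; those parts are left out because the abstract does not specify their parameters.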