A data warehouse (DW) is a subject-oriented, integrated, time-variant, non-volatile collection of data that supports management decision making. It integrates relevant data from source databases (DB) and other flat-file systems to support decision-making activities. A data warehouse is structured as a multidimensional model, which mainly takes one of three forms: star, snowflake, or constellation. Unlike traditional relational database design, data warehouse design is usually data-driven, and the quality of that design directly affects how the data warehouse system is built and used. This paper proposes a quantitative analysis method for evaluating the design quality of a data warehouse. The method quantifies quality indicators of the data sources from which the warehouse is derived: it analyzes data quality (DQ) indicators at two levels, the selected tables and the selected attributes, and then combines these indicators to compute an evaluation value for the design quality of the data warehouse. The process of analyzing the data-source quality indicators can also support the design of the data warehouse itself.
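The abstract does not state which DQ indicators are used or how they are combined, so the following is only a minimal sketch of the general idea of merging table-level and attribute-level indicators into a single score. Everything in it is an assumption made for illustration: the names `AttributeDQ`, `TableDQ`, `table_quality`, and `design_quality`, the use of completeness as the attribute indicator, the `[0, 1]` scale, and the weighted-mean combination are hypothetical and are not the paper's actual formulas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttributeDQ:
    """Hypothetical attribute-level indicator for one selected source column."""
    name: str
    completeness: float  # assumed indicator: share of non-null values, in [0, 1]


@dataclass
class TableDQ:
    """Hypothetical table-level indicator for one selected source table."""
    name: str
    table_score: float  # assumed table-level DQ indicator, in [0, 1]
    attributes: List[AttributeDQ] = field(default_factory=list)


def table_quality(table: TableDQ, attr_weight: float = 0.5) -> float:
    """Combine the table-level score with the mean attribute-level score.

    The weighted mean is an illustrative choice, not the paper's formula.
    """
    if not table.attributes:
        return table.table_score
    attr_mean = sum(a.completeness for a in table.attributes) / len(table.attributes)
    return (1 - attr_weight) * table.table_score + attr_weight * attr_mean


def design_quality(tables: List[TableDQ], attr_weight: float = 0.5) -> float:
    """Average the per-table scores into one evaluation value for the design."""
    if not tables:
        raise ValueError("at least one source table is required")
    return sum(table_quality(t, attr_weight) for t in tables) / len(tables)


# Made-up example values for two source tables feeding a warehouse design.
sources = [
    TableDQ("orders", 0.8, [AttributeDQ("order_id", 1.0), AttributeDQ("ship_date", 0.5)]),
    TableDQ("customers", 0.6, [AttributeDQ("email", 0.9)]),
]
score = design_quality(sources)
print(f"design quality score: {score:.4f}")
```

A scoring function of this shape can also be evaluated per table before the warehouse is built, which is one way the analysis of source quality could feed back into the choice of source tables and attributes during design.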