Dirty data poses new challenges for data management. Existing approaches handle dirty data mainly through data cleaning, which has limitations in practical applications, so the presence of dirty data has to be tolerated to some extent. Managing databases that contain dirty data therefore becomes an important problem. Its core is how to obtain, from such a database, query results whose clean degree meets the requirements of the application. From the perspective of dirty data processing, this paper presents a data model for dirty databases. The model provides a representation of dirty data, supports data operations on dirty data, and supports computing the clean degree of the tuples produced by those operations. Equivalent transformation rules for query expressions on dirty data and a preliminary implementation of the model are also discussed.
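To make the idea of a query result with a required clean degree concrete, the following is a minimal sketch rather than the model defined in the paper. The abstract does not give the representation or the formulas, so everything here is an assumption made for illustration: the names `DirtyTuple`, `select`, `join`, and `at_least` are hypothetical, the clean degree is assumed to be a number in [0, 1] attached to each tuple, and a joined tuple is assumed to take the product of its inputs' clean degrees (an independence assumption).

```python
from dataclasses import dataclass


@dataclass
class DirtyTuple:
    """A tuple annotated with a clean degree in [0, 1] (assumed representation)."""
    values: dict
    clean_degree: float


def select(relation, predicate):
    # Selection keeps matching tuples and leaves their clean degree unchanged.
    return [t for t in relation if predicate(t.values)]


def join(left, right, key):
    # Equi-join on `key`. The product rule below is an illustrative assumption
    # (independent inputs), not the rule defined in the paper.
    out = []
    for l in left:
        for r in right:
            if l.values[key] == r.values[key]:
                merged = {**l.values, **r.values}
                out.append(DirtyTuple(merged, l.clean_degree * r.clean_degree))
    return out


def at_least(relation, threshold):
    # Keep only tuples whose clean degree meets the application's requirement.
    return [t for t in relation if t.clean_degree >= threshold]


# Made-up example data, for illustration only.
customers = [
    DirtyTuple({"cid": 1, "name": "Ann"}, 0.9),
    DirtyTuple({"cid": 2, "name": "Bob"}, 0.5),
]
orders = [
    DirtyTuple({"cid": 1, "item": "book"}, 0.8),
    DirtyTuple({"cid": 2, "item": "pen"}, 1.0),
]

result = at_least(join(customers, orders, "cid"), threshold=0.7)
for t in result:
    print(t.values, round(t.clean_degree, 2))
```

Under these assumptions, only the joined tuple for customer 1 survives the 0.7 threshold, since the tuple for customer 2 inherits the low clean degree of its customer record. Applying `at_least` to the inputs before the join would be one example of the kind of query rewriting that equivalent transformation rules make possible, although whether that particular rewrite preserves results depends on the clean degree rule actually used.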