To address two problems in heterogeneous data integration, the quality of the data sources themselves and the real-time updating of the target data, this paper proposes a real-time heterogeneous data integration model based on reverse data cleaning, built on adapter, XML, and reverse cleaning techniques. The model processes heterogeneous data in two ways. First, real-time threads extract, clean, and save newly added or modified source data, so that the target data is updated in real time. Second, a reverse cleaning process uses valid data from the platform or from the integrated result to repair errors and missing values in the original data. Experimental results show that the model improves the quality of both the original data and the target data.
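The two directions of processing can be illustrated with a minimal sketch. It is not the authors' implementation: the names `forward_clean` and `reverse_clean`, the dictionary-based record layout, and the sample values are all assumptions made for illustration. The forward path is shown as a plain function rather than a real-time thread, and the reverse path only fills missing fields, although the model described above also repairs erroneous ones.

```python
from typing import Optional

# Hypothetical record layout: field name -> value, with None meaning "missing".
Record = dict[str, Optional[str]]


def forward_clean(record: Record) -> Record:
    """Forward step (assumed): trim strings and turn empty strings into None."""
    return {
        field: (value.strip() or None) if isinstance(value, str) else value
        for field, value in record.items()
    }


def reverse_clean(
    source: dict[str, Record], trusted: dict[str, Record]
) -> list[tuple[str, str]]:
    """Reverse step (assumed): fill missing source fields from trusted records.

    Modifies `source` in place and returns the (key, field) pairs repaired.
    """
    repaired = []
    for key, good in trusted.items():
        row = source.get(key)
        if row is None:
            continue  # no matching source record to repair
        for field, value in good.items():
            if value is not None and row.get(field) is None:
                row[field] = value
                repaired.append((key, field))
    return repaired


# Illustrative data only: one source record with a missing city.
source = {"42": {"name": " Li Wei ", "city": None}}
target = {key: forward_clean(rec) for key, rec in source.items()}
target["42"]["city"] = "Harbin"  # assumed to be validated on the platform
fixed = reverse_clean(source, target)
```

In this sketch the integrated (target) copy acts as the trusted reference, so a value confirmed after integration flows back into the original source record.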