This paper presents a general-purpose, extensible data cleaning system. Starting from the underlying design principles, it develops the design of the whole system step by step. A workflow engine manages the cleaning workflow; during execution, the workflow invokes cleaning services and cleaning components layer by layer and works with a knowledge base to carry out the cleaning operations. A concrete application then illustrates how the system cleans data according to a defined workflow. A system designed with this approach has been deployed successfully in the data sharing platform project of a municipal civil affairs bureau, and practical results indicate that it offers good performance and application value.