Existing process mining techniques and tools operate on a single log file. In real business environments, however, the execution of one business process is often supported by several information systems, so the process log data they produce are spread across different log files. These scattered logs must be merged before the global process can be mined and analyzed. This paper proposes a log merging method based on a hybrid of simulated annealing and an artificial immune algorithm. To account for the characteristics of process logs that span multiple IT systems, the affinity function considers two operators, the occurrence frequency of process paths (activity sequences) and the time overlap between mergeable cases, which improves the accuracy of case matching and the practical value of the method. Simulated annealing selection is introduced into population evolution to address the premature convergence and continuous degradation of the artificial immune algorithm, and an immunological memory mechanism is added to better preserve the diversity of each generation and avoid local convergence of the population. Experimental results show that the proposed method achieves a log merging success rate above 90% and can ensure the correctness of the process mining results; compared with the traditional log merging method based on artificial immunity, it converges markedly faster and thus merges logs more efficiently.
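The two affinity operators and the simulated annealing selection named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the weighted-sum form of the affinity, the default weights, and the use of the standard Metropolis acceptance rule are not taken from the paper.

```python
import math
import random


def time_overlap(case_a, case_b):
    """Length of the overlap between two (start, end) intervals; 0 if disjoint."""
    start = max(case_a[0], case_b[0])
    end = min(case_a[1], case_b[1])
    return max(0.0, end - start)


def affinity(path_frequency, overlap, w_freq=0.5, w_overlap=0.5):
    """Hypothetical affinity: weighted sum of two scores assumed to lie in [0, 1].

    The weighted-sum form and the weights are assumptions, not the paper's formula.
    """
    return w_freq * path_frequency + w_overlap * overlap


def sa_accept(current_affinity, candidate_affinity, temperature, rng=random):
    """Standard Metropolis rule for a maximization problem.

    An improving candidate is always accepted; a worse one is accepted
    with probability exp(delta / temperature), where delta < 0.
    """
    delta = candidate_affinity - current_affinity
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)
```

Occasionally accepting a worse candidate while the temperature is high is what lets a search of this kind leave a local optimum; as the temperature falls, the rule approaches purely greedy selection.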