With the advent of the big data era, a distributed algorithm based on the MapReduce framework is proposed for converting massive distributed event logs on a cloud platform with high performance. First, an improved algorithm based on case splitting is proposed, which makes the conversion of logs on a single machine feasible. Building on this, a parallel conversion algorithm based on MapReduce is proposed. In the field of process mining, this is the first parallel conversion from massive raw logs to eXtensible Event Stream (XES) event logs, and it greatly improves conversion performance.
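The abstract does not spell out the algorithm, so the following is only a minimal single-process sketch of the general map, shuffle, and reduce pattern such a conversion could follow; it is not the authors' implementation. The raw line format `case_id,activity,timestamp`, the function names (`map_line`, `shuffle`, `reduce_case`, `convert`), and the choice of XES attributes are all assumptions made for illustration. Timestamps are assumed to be ISO 8601 strings in one uniform format, so that sorting them as text orders events chronologically.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def map_line(line):
    """Map step: parse one raw log line into (case_id, (timestamp, activity)).

    Assumed (hypothetical) line format: case_id,activity,timestamp
    """
    case_id, activity, timestamp = line.strip().split(",")
    return case_id, (timestamp, activity)


def shuffle(pairs):
    """Group mapped pairs by case id, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_case(case_id, events):
    """Reduce step: build one XES <trace> element from all events of a case."""
    trace = ET.Element("trace")
    ET.SubElement(trace, "string", key="concept:name", value=case_id)
    # Sorting (timestamp, activity) tuples orders events by timestamp first.
    for timestamp, activity in sorted(events):
        event = ET.SubElement(trace, "event")
        ET.SubElement(event, "string", key="concept:name", value=activity)
        ET.SubElement(event, "date", key="time:timestamp", value=timestamp)
    return trace


def convert(lines):
    """Run map, shuffle, and reduce in one process and return an XES <log>."""
    log = ET.Element("log")
    groups = shuffle(map_line(line) for line in lines if line.strip())
    for case_id in sorted(groups):
        log.append(reduce_case(case_id, groups[case_id]))
    return log


if __name__ == "__main__":
    raw = [
        "c2,A,2020-01-01T10:00:00",
        "c1,B,2020-01-01T09:05:00",
        "c1,A,2020-01-01T09:00:00",
    ]
    print(ET.tostring(convert(raw), encoding="unicode"))
```

Keying the map output by case identifier means every event of a case reaches the same reducer, so each trace can be assembled independently of the others; on a real cluster the `shuffle` step would be performed by the MapReduce framework rather than in memory.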