为了实现对实时网络数据流的快速分析,设计一种分布式实时数据流分析系统(DRDAS),能有效解决并发访问数据流的收集、存储和实时分析问题,为大数据环境的网络安全检测提供了一种有效的数据分析平台;根据Spark Streaming运行的原理设计一种动态采样的K-Means并行算法,与DRDAS结合能实时有效地检测大数据环境下的各种分布式拒绝服务(DDo S)攻击。实验结果显示:DRDAS具有好的可扩展性、容错性和实时处理能力,与动态采样的K-Means并行算法结合能实时地检测各种DDo S攻击,缩短了攻击的检测时间。
In order to realize the rapid analysis of massive real-time data, a Distributed Real-time Data Analysis System (DRDAS) was designed, which resolved the collection, storage and real-time analysis for mass concurrent data. And according to the operation principle of Spark Streaming, a dynamic sampling K-means parallel algorithm was proposed, which could quickly and efficiently detect all kinds of DDoS ( Distributed Denial of Service) attacks. The experimental results show that the DRDAS has good scalability, fault tolerance and real-time processing ability, and along with new K-means parallel algorithm, the DRDAS can real-time detect various DDoS attacks, and shorten the detecting time of attacks.