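Skew is a pervasive problem in parallel systems and has a large impact on system performance. An event stream database, serving as the back-end analysis and processing system for data stream applications, must handle continuous, high-volume event stream loading alongside user queries, and traditional approaches to data skew cannot accommodate this dynamic loading. Taking backbone network security monitoring as the research background and drawing on the characteristics of event stream workloads, this paper proposes a periodic-counting-based, capability-aware load balancing strategy for loading in shared-nothing event stream parallel databases. While preserving loading performance, the method automatically adjusts the data distribution online according to the capability of each loading node; it not only effectively prevents and resolves system skew but also lays a good foundation for query performance. Both simulation analysis and real-system tests show that this load balancing strategy is more effective than other strategies.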
Skew is one of the most important problems in parallel systems and has a great impact on their performance. An event stream system is the back-end data processing and analysis system of a data stream management system (DSMS). It differs from a traditional database system in its workload characteristics: it receives continuous, fast-arriving, high-volume event stream data on one side and must respond quickly to users' queries on the other. Under these conditions, the common data redistribution solutions to data skew are no longer suitable. This paper presents a periodical-counting-based capability-aware (PCCA) loading strategy built on DBroker, a shared-nothing event stream parallel database system for backbone network monitoring applications. The strategy not only loads event stream data quickly and correctly, but also detects and prevents skew automatically by adapting to the loading capability of each node. In addition, it establishes a good data distribution as a foundation for the query service. Finally, both simulation model analysis and real-system testing show that the PCCA loading strategy performs much better than three other methods.