To address the compression of large data sets in intelligent dispatching systems, a new lossless cluster compression method based on a cloud architecture is proposed, built on the Hadoop framework and the Map/Reduce distributed programming model. Lossless compression schemes based on dictionary coding and statistical coding are classified and compared. The dispatching host and the monitoring servers are deployed following the cluster network configuration of cloud computing nodes, and lossless compression is integrated into the cluster data nodes to build an experimental environment for lossless cluster compression of dispatching and monitoring information. Tests on section measurement records from the dispatching side show that, for the same set of section records, cluster compression in the BZip2 format achieves a better compression ratio than the Deflate and Gzip formats. BZip2 cluster compression results for section record sets of different sizes show that, when the number of section records exceeds 3×10⁶, the compression ratio reaches 81.1%, an improvement of more than 30% over traditional lossless compression methods.
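The comparison of the three formats can be illustrated on a single machine with a minimal sketch, given here under several assumptions. It uses Python's standard-library `bz2`, `zlib` (Deflate) and `gzip` modules rather than a Hadoop cluster, the record layout produced by `make_records` is hypothetical and synthetic, and "compression ratio" is assumed to mean the fraction of bytes saved, which the abstract does not define. The numbers it produces are therefore not comparable with the reported 81.1%.

```python
import bz2
import gzip
import zlib


def space_saving(original: bytes, compressed: bytes) -> float:
    """Fraction of bytes removed by compression.

    Assumption: this is what 'compression ratio' means here.
    """
    return 1.0 - len(compressed) / len(original)


def make_records(n: int) -> bytes:
    """Build n synthetic measurement records as CSV lines.

    The fields (index, bus name, voltage, angle) are hypothetical;
    they only mimic the repetitive structure of logged measurements.
    """
    lines = []
    for i in range(n):
        voltage = 220.0 + (i % 7) * 0.1
        lines.append(f"{i},bus_{i % 50},{voltage:.1f},{i % 360}\n")
    return "".join(lines).encode("utf-8")


def compare_codecs(data: bytes) -> dict:
    """Compress the same data with each format at its highest level."""
    return {
        "bzip2": space_saving(data, bz2.compress(data, 9)),
        "deflate": space_saving(data, zlib.compress(data, 9)),
        "gzip": space_saving(data, gzip.compress(data, compresslevel=9)),
    }


def roundtrip_ok(data: bytes) -> bool:
    """Check that every format restores the input exactly (lossless)."""
    return (
        bz2.decompress(bz2.compress(data)) == data
        and zlib.decompress(zlib.compress(data)) == data
        and gzip.decompress(gzip.compress(data)) == data
    )


if __name__ == "__main__":
    records = make_records(100_000)
    for name, saving in compare_codecs(records).items():
        print(f"{name}: {saving:.1%} of bytes saved")
```

Which format comes out ahead depends on the data, so the sketch makes no claim about the ordering; in a Hadoop deployment the same choice would be made through the job's compression codec setting rather than through direct library calls.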