To improve cluster load balancing in HDFS and raise the resource utilization of Datanodes, an improved data placement strategy is proposed. Building on the original HDFS strategy, it takes full account of the differences among nodes: the CPU load, memory load, network load, and storage load of each Datanode are collected, and a load factor is introduced to compute the cost of placing data on each node, so that the most suitable Datanode is selected to store the data. Experimental results show that the improved strategy achieves a better-balanced cluster load and improves the transfer performance of data block replicas.
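
The selection step described above can be illustrated with a minimal sketch. The abstract does not give the cost formula or the load factor values, so everything here is an assumption made for illustration: the cost is taken to be a weighted sum of the four load metrics, each metric is assumed to be normalized to [0, 1], and the names `NodeLoad`, `LOAD_FACTORS`, `placement_cost`, and `choose_datanode` are hypothetical rather than part of HDFS or of the proposed strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    """Load snapshot of one Datanode; each metric is assumed normalized to [0, 1]."""
    name: str
    cpu: float
    memory: float
    network: float
    storage: float


# Hypothetical load factors (weights). The source does not state their values;
# equal weights are used here purely as a placeholder.
LOAD_FACTORS = {"cpu": 0.25, "memory": 0.25, "network": 0.25, "storage": 0.25}


def placement_cost(node, factors=LOAD_FACTORS):
    """Assumed cost model: weighted sum of the four load metrics."""
    return (
        factors["cpu"] * node.cpu
        + factors["memory"] * node.memory
        + factors["network"] * node.network
        + factors["storage"] * node.storage
    )


def choose_datanode(nodes, factors=LOAD_FACTORS):
    """Return the candidate node with the lowest placement cost."""
    if not nodes:
        raise ValueError("no candidate Datanodes")
    return min(nodes, key=lambda n: placement_cost(n, factors))


if __name__ == "__main__":
    candidates = [
        NodeLoad("dn1", cpu=0.9, memory=0.8, network=0.5, storage=0.4),
        NodeLoad("dn2", cpu=0.2, memory=0.3, network=0.4, storage=0.5),
        NodeLoad("dn3", cpu=0.5, memory=0.5, network=0.5, storage=0.5),
    ]
    best = choose_datanode(candidates)
    print(best.name, placement_cost(best))
```

Under this assumed model, a lightly loaded node is preferred even when its storage usage is somewhat higher than that of its peers; changing the load factors shifts that trade-off.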