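Laying out data effectively is a major challenge for large-scale network storage systems, which need a fair and efficient data placement algorithm that adapts to changes in storage scale. The proposed CCHDP (clustering-based and consistent hashing-aware data placement) algorithm combines a clustering algorithm with consistent hashing and introduces only a small number of virtual devices, which greatly reduces the storage space required. Theory and experiments show that CCHDP distributes data fairly according to device weights, adapts to the addition and removal of storage devices, migrates the least amount of data when the storage scale changes, and locates data quickly while consuming little storage space.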
Large-scale network storage systems face the challenge of distributing data efficiently among storage devices, which calls for a data placement algorithm that is efficient, fair, and adaptive. This paper presents CCHDP (clustering-based and consistent hashing-aware data placement), an algorithm for distributing data over heterogeneous devices in such systems. It combines a clustering algorithm with consistent hashing and saves considerable memory by avoiding extra virtual devices. Analysis and experiments show that CCHDP not only assigns data evenly among devices but also adapts well to the addition or departure of devices: the amount of data moved when the device set changes is nearly equal to the optimal amount. Moreover, CCHDP is time efficient and incurs little memory overhead.
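For orientation, the following is a minimal sketch of plain weighted consistent hashing with virtual devices, the baseline technique that CCHDP builds on. It is not the CCHDP algorithm itself, since the clustering step is not described in this abstract. The class name `WeightedRing`, the parameter `vnodes_per_weight`, and the use of MD5 as the ring hash are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on a 2**32 ring (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class WeightedRing:
    """Plain weighted consistent hashing (illustrative baseline, not CCHDP)."""

    def __init__(self, vnodes_per_weight: int = 100):
        # Hypothetical tuning knob: virtual devices created per unit of weight.
        self.vnodes_per_weight = vnodes_per_weight
        self._points = []  # sorted ring positions of all virtual devices
        self._owner = {}   # ring position -> physical device name

    def add_device(self, name: str, weight: int) -> None:
        """Place weight * vnodes_per_weight virtual devices on the ring."""
        for i in range(weight * self.vnodes_per_weight):
            point = ring_hash(f"{name}#{i}")
            if point in self._owner:
                continue  # rare hash collision: keep the existing owner
            bisect.insort(self._points, point)
            self._owner[point] = name

    def remove_device(self, name: str) -> None:
        """Drop every virtual device that belongs to the named device."""
        self._points = [p for p in self._points if self._owner[p] != name]
        self._owner = {p: d for p, d in self._owner.items() if d != name}

    def locate(self, key: str) -> str:
        """Return the device owning the first ring point clockwise of the key."""
        if not self._points:
            raise LookupError("no devices on the ring")
        i = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

In this baseline, a device of weight 3 receives roughly three times the data of a device of weight 1, and adding a device moves only the keys that the new device takes over. The cost is that the ring stores one entry per virtual device, so memory grows with the total weight; this is the overhead that the abstract says CCHDP reduces.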