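Modern data centers widely adopt technologies such as virtualization and hybrid storage to meet growing demands for storage capacity and performance, which makes heterogeneity in storage systems increasingly common. A typical problem in heterogeneous storage systems is that, because the load on a device does not match its service capability, parallel access techniques widely used in storage systems, such as striping, cannot be fully effective, and performance degrades. To address this problem, we propose a cache allocation algorithm based on workload characteristic identification and access performance prediction, named Caper (access-pattern aware and performance prediction-based cache allocation algorithm). Caper uses cache allocation to adjust the distribution of I/O load across storage devices so that the load on each device matches its service capability, thereby alleviating or even eliminating performance bottlenecks in heterogeneous storage systems. Experimental results show that Caper effectively improves the performance of heterogeneous storage systems: under mixed workloads, it improves performance by about 26.1% on average over the Chakraborty algorithm, about 28.1% over the Forney algorithm, and about 30.3% over the Clock algorithm, and by about 7.7% and 17.4% on average over the Chakraborty and Forney algorithms with prefetching added, respectively.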
Storage systems are growing in scale as the amount of data produced increases rapidly. The development of computing technologies such as cloud computing, cloud storage, and big data places higher demands on storage systems: higher capacity, higher performance, and higher reliability. To satisfy the growing demand for capacity and performance, modern data centers widely adopt technologies such as virtualization and hybrid storage to scale storage and performance dynamically, which makes storage systems increasingly heterogeneous. Heterogeneous storage systems introduce several new problems, a key one being performance degradation caused by load imbalance. The differences in capacity and performance among heterogeneous storage devices make it difficult for parallel access techniques, such as RAID and erasure coding, to achieve high performance. To address this problem, we propose a caching algorithm based on performance prediction and identification of workload characteristics, named Caper (access-pattern aware and performance prediction-based cache allocation algorithm). The main idea of Caper is to distribute load according to the service capability of the storage devices, with the aim of alleviating load imbalance or eliminating the performance bottleneck in heterogeneous storage systems. Caper consists of three parts: performance prediction for I/O requests, analysis of the caching requirements of each storage device, and a cache replacement policy. The algorithm also classifies application workloads into three types: random access, sequential access, and looping access. To keep cache utilization high, the algorithm adjusts the size of each logical cache partition based on the analysis of caching requirements.
In addition, to suit heterogeneous storage systems, Caper uses an improved version of the Clock cache replacement algorithm. The experimental results also s