Information lifecycle management (ILM) is a sustainable storage strategy that balances storage cost against efficient information management. This paper presents the design and implementation of the Tsinghua Information Lifecycle Management system (THILM), which is based on hierarchical storage management (HSM) and targets SAN and NAS storage networks. A new file-reparse implementation technique is proposed that gives the HSM application server and the system application layer transparent access to dynamically stored data and provides unified management of data across multiple storage tiers. The paper also discusses how optimizing application-level metadata management and improving the policy cache technique make the execution of management policies efficient. Performance tests show that the overhead introduced by the file-reparse technique is almost negligible and that the policy cache improves policy execution performance by an order of magnitude. In addition, to address the slow initialization of the original policy cache, a Policy Metadata Container (PMC) is introduced; with PMC, the initialization time of the THILM policy cache is reduced by a factor of 16 to 20.
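The abstract does not describe the internals of the policy cache, so the following is only a generic illustration of the idea, not the THILM implementation. It assumes that a policy cache keeps the per-file metadata needed by migration policies in memory, so that evaluating a policy does not require rescanning the file system. All names here (`FileMeta`, `PolicyCache`, `cold_file_policy`, the tier numbering) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class FileMeta:
    """Policy-relevant metadata for one file (hypothetical fields)."""
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch
    tier: int           # 0 = fastest storage tier (assumed numbering)


class PolicyCache:
    """In-memory cache of file metadata consulted by migration policies."""

    def __init__(self, loader: Callable[[], Iterable[FileMeta]]) -> None:
        self._loader = loader            # expensive source, e.g. a metadata store
        self._entries: Dict[str, FileMeta] = {}
        self.loads = 0                   # how many times the loader was invoked

    def initialize(self) -> None:
        """Populate the cache once from the expensive source."""
        self._entries = {m.path: m for m in self._loader()}
        self.loads += 1

    def update(self, meta: FileMeta) -> None:
        """Keep the cache current when a file's metadata changes."""
        self._entries[meta.path] = meta

    def select(self, predicate: Callable[[FileMeta], bool]) -> List[str]:
        """Return the sorted paths of cached files that match a policy."""
        return sorted(m.path for m in self._entries.values() if predicate(m))


def cold_file_policy(now: float, max_idle_seconds: float,
                     from_tier: int) -> Callable[[FileMeta], bool]:
    """Build a policy matching files on `from_tier` idle longer than the limit."""
    def predicate(meta: FileMeta) -> bool:
        return meta.tier == from_tier and now - meta.last_access > max_idle_seconds
    return predicate
```

In this sketch, `initialize` is the one-time expensive step, which corresponds loosely to the initialization cost that the abstract says PMC reduces; after that, each policy run is a scan over in-memory entries through `select`.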