What a server system can least tolerate is frequent errors, or outright crashes, that disrupt the work of normal users; such a system therefore needs the ability to recover by itself. The self-recovery strategy most widely studied and applied today, rollback to a checkpoint, is not well suited to recovering multi-user server programs. Based on the characteristics of multi-user server programs, this paper designs a virtual-machine-based self-recovery system named VMSRS (virtual machine monitor-self recovery of service program). The main idea of VMSRS is to make the virtual machine monitor (VMM) the agent of recovery. It exploits the advantages the virtual machine layer offers as an independent third-party underlying system and as the manager and monitor of hardware resources, and it strictly guarantees the consistency of user data, the atomicity of data and metadata operations, and the secure isolation of recovery data. VMSRS also applies an improved form of the SRS (self recovery of service program) idea: when an error occurs, it does not roll back. Instead, it contains the error so that normal users are not affected, and it lets normal users and the server continue to run forward as if no error had occurred. Rollback is avoided by relying on the cleanup mechanisms of the system itself and of VMSRS. The work designs and implements VMSRS with crash suppression, demand-driven restoration, monitoring, and storage management modules, and it mainly targets memory errors in multi-user server systems. Experimental analysis of basic function, basic performance, and overall function shows that VMSRS, without rolling back and without sacrificing performance, provides good security for recovery data and a thorough guarantee of consistency for user state data. It recovers multi-threaded programs well and places no restrictions on threads. This work also serves as a practical exploration of how to design self-recovery systems on top of virtualization technology.