To enable a Grid monitoring system to handle its own faults autonomously and to cope with the complex and changing Grid environment, the R-Net monitoring system (RNMS) investigates the detection, diagnosis, handling, and recovery of monitoring-system faults, drawing on autonomic organization and management mechanisms and on the self-repairing and self-healing ideas of autonomic computing. The work focuses on fault recovery for node groups. A fault diagnosis and recovery mechanism for the Grid monitoring system has been implemented, allowing the monitoring system to handle some common faults autonomously, without intervention from administrators. This reduces the complexity and cost of system maintenance and improves the usability and reliability of the system.