Predicting faults in a cloud service system before it fails buys time for system operators and recovery mechanisms, and thus improves quality of service. To predict latent faults in such systems effectively, this paper proposes an unsupervised fault prediction approach based on statistical testing. First, we define the cloud service system as a parallel system whose nodes run in the same software and hardware environment and receive the same input data. During data preprocessing, we normalize the data in the performance counters and retain the counter data selected at a given percentile. Finally, based on the principle that nodes with the same software/hardware environment and the same input data should produce the same output, we propose a statistical test for predicting faults. Experiments show that the proposed approach achieves higher prediction accuracy and shorter execution time than other related algorithms.
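The pipeline described above (normalize the counters, summarize each node at a percentile, then test each node against its peers) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not name the statistical test, so the sketch assumes a robust z-score based on the median and the median absolute deviation (MAD). The data layout, the function names `percentile` and `suspect_nodes`, the 90th percentile, and the threshold of 3.5 are all hypothetical choices made for illustration.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile (0 < p <= 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def suspect_nodes(samples, p=90, threshold=3.5):
    """Return the sorted names of nodes that deviate from their peers.

    samples: {node: {counter: [readings, ...]}}  (hypothetical layout)
    The MAD-based score is an assumed stand-in for the paper's test.
    """
    nodes = sorted(samples)
    counters = sorted(samples[nodes[0]])
    suspects = set()
    for counter in counters:
        # Standardize the counter using readings pooled over all nodes.
        pooled = [x for n in nodes for x in samples[n][counter]]
        mean = statistics.fmean(pooled)
        std = statistics.pstdev(pooled) or 1.0  # constant counter guard
        # Summarize each node by one percentile of its standardized data.
        summary = {
            n: percentile([(x - mean) / std for x in samples[n][counter]], p)
            for n in nodes
        }
        # Identical nodes should agree, so score the distance to the
        # peer median in units of the (scaled) MAD.
        med = statistics.median(summary.values())
        mad = statistics.median(abs(v - med) for v in summary.values())
        scale = max(1.4826 * mad, 1e-6)
        for n in nodes:
            if abs(summary[n] - med) / scale > threshold:
                suspects.add(n)
    return sorted(suspects)


# Four healthy nodes with similar CPU readings, one node drifting upward.
samples = {
    f"n{i}": {"cpu": [10 + 0.1 * i + 0.01 * j for j in range(20)]}
    for i in range(4)
}
samples["n4"] = {"cpu": [20 + 0.01 * j for j in range(20)]}
print(suspect_nodes(samples))  # only n4 is flagged
```

Because the comparison uses only the peer population, no labeled fault data is needed, which is what makes such a scheme unsupervised. The median and MAD are used here instead of the mean and standard deviation so that a single faulty node does not distort the baseline it is compared against.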