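Large-scale cloud computing servers involve abundant resources and huge volumes of data, so diagnosing their faults requires a large amount of scientific computation. Current fault diagnosis platforms diagnose server faults by extracting features from fault information, which is very inefficient. A new fault diagnosis platform for large-scale cloud computing servers is therefore designed. The overall structure of the platform is given, and the main control chip, power supply circuit, reset circuit, wireless communication module, and fault diagnosis module, which together carry out the fault diagnosis of large-scale cloud computing servers, are analyzed in detail. For the software design, the diagnosis process of the platform is described and the code of the detailed implementation is given. After passing the system's identity authentication, users diagnose server faults through the platform until the faults are eliminated. Experimental results show that diagnosing faults of large-scale cloud computing servers with the designed platform not only achieves a high diagnosis success rate but also takes less time.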
Current fault diagnosis platforms are inefficient because they diagnose server failures by extracting fault information characteristics. Therefore, a new fault diagnosis platform for large-scale cloud computing servers was designed. In this paper, the general structure of the platform is described; the main control chip, power supply circuit, reset circuit, wireless communication module, and fault diagnosis module are analyzed in detail; the diagnosis process of the fault diagnosis platform is introduced; and the code of the implementation process is given. Users can diagnose faults in their servers through the platform after passing the system's authentication. The experimental results show that the platform achieves a high success rate and requires less time for fault diagnosis of large-scale cloud computing servers.