The principle and basic procedure of condition monitoring and fault diagnosis based on the hidden Markov model (HMM) were proposed. Condition monitoring of a dual modular redundant safety computer was realized through the extraction and dimensionality reduction of observed data, the training and improvement of the normal-status model, and the training of fault-status models. Seven conditions were monitored: the normal status and statuses with clock offsets of 1% to 10%. The monitoring results show that the average log-likelihood decreases from -228.98 to -1385.60, which indicates that the health status deteriorates continuously. In the simulated monitoring of a processing unit 1 (PU1) fault, the fault data were compared with the PU1 fault status, the normal status, the fault tolerance and safety management (FTSM) unit fault status, the communication controller (CC) fault status, and the system interference fault status; the resulting average log-likelihoods are -161.95, -13.72, -14.13, -40.17 and -35.69, respectively, which verifies that the system fault is caused by PU1. The proposed method can effectively monitor the health status of the safety computer and provides theoretical support for the monitoring technology of railway signal safety computers. 7 figs, 15 refs.
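The monitoring quantity reported above, the average log-likelihood of observation sequences under a trained normal-status HMM, can be illustrated with a minimal sketch. It assumes a discrete-observation HMM evaluated with the scaled forward algorithm. The two-state, three-symbol parameters (`START`, `TRANS`, `EMIT`) and the `nominal` and `drifted` sequences are hypothetical values chosen for illustration; they are not the model, features, or data used in the paper.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # The log-likelihood is the sum of the logs of the scaling factors.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def mean_log_likelihood(sequences, start, trans, emit):
    """Average log-likelihood over several observation sequences."""
    total = sum(log_likelihood(s, start, trans, emit) for s in sequences)
    return total / len(sequences)


# Hypothetical normal-status model (assumed values, not from the paper).
START = [0.9, 0.1]
TRANS = [[0.95, 0.05], [0.10, 0.90]]
EMIT = [[0.80, 0.15, 0.05], [0.10, 0.30, 0.60]]

# Hypothetical quantised observations: nominal behaviour vs. a drifted one.
nominal = [[0, 0, 1, 0, 0, 0, 1, 0], [0, 1, 0, 0, 0, 0, 0, 1]]
drifted = [[2, 2, 1, 2, 0, 2, 2, 2], [2, 1, 2, 2, 2, 2, 1, 2]]

nominal_score = mean_log_likelihood(nominal, START, TRANS, EMIT)
drifted_score = mean_log_likelihood(drifted, START, TRANS, EMIT)
print(f"nominal: {nominal_score:.2f}, drifted: {drifted_score:.2f}")
```

In this sketch the drifted sequences score lower under the normal-status model than the nominal ones, which mirrors the trend described above, where a falling average log-likelihood is read as a deteriorating health status. Per-step scaling keeps the forward variables from underflowing on long sequences.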