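To improve the reliability of electronic systems in the harsh space environment, a strongly fault-tolerant triple modular redundancy (TMR) system structure with chip-level online repair capability is proposed, together with its design method; faulty modules can be repaired online without affecting normal system operation. The system adopts a TMR structure that detects and locates faulty modules in real time; each module is designed with component-level backup, so that when a fault occurs it can self-repair quickly by switching to a spare; every faulty component in a module can be repaired through evolution; and heterogeneous redundancy lowers the probability that two or more modules fail at the same time. Taking the design of a third-order high-density bipolar (HDB3) encoder system with on-chip TMR as an example, the system structure and the various fault-tolerance and repair mechanisms were verified, and the results show that system reliability is greatly improved.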
A new system structure and design approach is proposed for triple modular redundancy (TMR) systems-on-chip with multi-level online self-repair mechanisms, in which a faulty module can be repaired without affecting the system's normal operation. The system is composed of three reconfigurable redundant modules and a control processor, and faulty modules are detected autonomously. Each module is made up of subassemblies with spare parts and can recover from a fault quickly by switching to a spare; in addition, each faulty subassembly in a module can be repaired through evolution. Redundant circuits with different structures are used to reduce the probability that faults occur in two or more modules at the same time. The design method and all the self-repair mechanisms are verified on a TMR HDB3 encoder system-on-chip. The results show that the reliability of the system is greatly enhanced.
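The detection and localization step of a TMR structure can be illustrated with a minimal software sketch of a majority voter. This is not the paper's hardware design: the function name `tmr_vote` and its return convention are assumptions made only for illustration, and the three module outputs are modeled as plain integers.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def tmr_vote(outputs: Sequence[int]) -> Tuple[Optional[int], Optional[int]]:
    """Majority-vote three module outputs (hypothetical helper).

    Returns (voted_value, faulty_index):
      - faulty_index is None when all three modules agree;
      - voted_value is None when no two modules agree, in which case
        the fault cannot be masked or located by voting alone.
    """
    if len(outputs) != 3:
        raise ValueError("TMR expects exactly three module outputs")

    # Most frequent output value and how many modules produced it.
    value, count = Counter(outputs).most_common(1)[0]

    if count == 3:
        return value, None  # no disagreement, nothing to repair
    if count == 2:
        # The single dissenting module is the one to flag for repair.
        faulty = next(i for i, out in enumerate(outputs) if out != value)
        return value, faulty
    return None, None  # all three differ: voting fails
```

In a scheme like the one described, the index returned for the dissenting module would be what triggers spare switching or evolutionary repair, while the voted value keeps the system output valid in the meantime. The last case, where no two modules agree, is the situation that the heterogeneous redundancy is meant to make less likely.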