Data races are a major cause of concurrency bugs in multi-core programs. To address the high hardware overhead of existing hardware-based happens-before data race detection methods, a lightweight hardware data race detection algorithm based on sliding windows is proposed. The algorithm dynamically detects data races with a small race distance, which are more likely to trigger concurrency bugs than distant ones. Taking the race distance into account, concurrent thread segments are subdivided into locked concurrent race regions and unlocked concurrent race regions, the latter containing the thread's recent execution sequence. A pair of alternately moving, rewritable sliding windows stores the memory instructions in the unlocked concurrent race region, and a rewritable sliding window of variable size stores the memory instructions in the locked concurrent race region. A data race is detected when a remote shared access conflicts with a memory access held in one of the windows. In the hardware implementation, only three pairs of small hardware signature registers are added to each processor core to hold the data addresses in the concurrent race regions. Neither the L1 cache nor the original cache coherence protocol needs to be modified, and the bandwidth overhead is low. The approach quickly detects dynamic data races during the concurrent execution of multi-core programs and provides effective guidance for diagnosing concurrency bugs in both the development and production-run phases.
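The window bookkeeping described above can be illustrated with a minimal software sketch. It is not the hardware design of the paper; it only models the idea under several assumptions that the abstract does not confirm: each window is backed by one read signature and one write signature (which would account for three pairs per core), signatures behave like small Bloom filters, the unlocked windows switch after a fixed number of accesses, and the locked window is cleared when the outermost lock is released. All names (`Signature`, `Window`, `CoreDetector`, `capacity`) and the two hash functions are hypothetical. Lock identity and happens-before ordering are deliberately not modeled.

```python
class Signature:
    """Bloom-filter-style address set: false positives possible, no false negatives."""

    def __init__(self, bits=64):
        self.bits = bits
        self.vector = 0

    def _positions(self, addr):
        # Two simple illustrative hash functions (an assumption, not from the paper).
        return (addr % self.bits, (addr // self.bits) % self.bits)

    def insert(self, addr):
        for p in self._positions(addr):
            self.vector |= 1 << p

    def may_contain(self, addr):
        return all((self.vector >> p) & 1 for p in self._positions(addr))

    def clear(self):
        self.vector = 0


class Window:
    """One sliding window, backed by an assumed read/write signature pair."""

    def __init__(self):
        self.reads, self.writes = Signature(), Signature()
        self.count = 0

    def record(self, addr, is_write):
        (self.writes if is_write else self.reads).insert(addr)
        self.count += 1

    def conflicts(self, addr, remote_is_write):
        # Conflict: same address and at least one of the two accesses is a write.
        if self.writes.may_contain(addr):
            return True
        return remote_is_write and self.reads.may_contain(addr)

    def clear(self):
        self.reads.clear()
        self.writes.clear()
        self.count = 0


class CoreDetector:
    """Per-core state: two alternating unlocked windows plus one locked window."""

    def __init__(self, capacity=4):
        self.capacity = capacity          # accesses per unlocked window (assumed policy)
        self.unlocked = [Window(), Window()]
        self.active = 0
        self.locked = Window()            # grows for as long as a lock is held
        self.lock_depth = 0

    def acquire(self):
        self.lock_depth += 1

    def release(self):
        self.lock_depth -= 1
        if self.lock_depth == 0:
            self.locked.clear()           # assumed: region ends at the outermost release

    def local_access(self, addr, is_write):
        if self.lock_depth > 0:
            self.locked.record(addr, is_write)
            return
        if self.unlocked[self.active].count == self.capacity:
            # Active window is full: switch to the other one and overwrite it.
            self.active ^= 1
            self.unlocked[self.active].clear()
        self.unlocked[self.active].record(addr, is_write)

    def remote_access(self, addr, is_write):
        """Return True if a remote shared access conflicts with any window."""
        windows = self.unlocked + [self.locked]
        return any(w.conflicts(addr, is_write) for w in windows)


core0 = CoreDetector(capacity=2)
core0.local_access(0x10, is_write=True)
print(core0.remote_access(0x10, is_write=False))  # True: remote read vs. recent local write
```

Because the two unlocked windows are overwritten in turn, the sketch remembers between `capacity` and `2 * capacity` of the most recent unlocked accesses, so only conflicts with a small race distance are reported. The signatures trade exactness for size: an address collision can yield a false report, but a recorded address is never missed.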